Mining Big Data in biomedicine and health care.
J Biomed Inform. 2016 Oct;63:400-403
Authors: Fodeh S, Zeng Q
PMID: 27670091 [PubMed - in process]
Healthcare hashtag index development: Identifying global impact in social media.
J Biomed Inform. 2016 Oct;63:390-399
Authors: Pinho-Costa L, Yakubu K, Hoedebecke K, Laranjo L, Reichel CP, Colon-Gonzalez MD, Neves AL, Errami H
PURPOSE: Create an index of global reach for healthcare hashtags and tweeters therein, filterable by topic of interest.
MATERIALS AND METHODS: For this proof-of-concept study we focused on the field of Primary Care and Family Medicine. Six hashtags were selected based on their importance, from the ones included in the 'Healthcare Hashtag Project'. Hashtag Global Reach (HGR) was calculated using the additive aggregation of five weighted, normalized indicator variables: number of impressions, tweets, tweeters, user locations, and user languages. Data were obtained for the last quarter of 2014 and first quarter of 2015 using Symplur Signals. Topic-specific HGR were calculated for the top 10 terms and for sets of quotes mapped after a thematic analysis. Individual Global Reach, IGR, was calculated across hashtags as additive indexes of three indicators: replies, retweets and mentions.
RESULTS: Using the HGR score we were able to rank six selected hashtags and observe their performance throughout the study period. We found that #PrimaryCare and #FMRevolution had the highest HGR score in both quarters; interestingly, #FMChangeMakers experienced a marked increase in its global visibility during the study period. "Health Policy" was the commonest theme, while "Care", "Family" and "Health" were the most common terms.
DISCUSSION: This is the first study describing an altmetric hashtag index. Assuming analytical soundness, the Index might prove generalizable to other healthcare hashtags. If released as a real-time business intelligence tool with customizable settings, it could aid publishing and strategic decisions by netizens, organizations, and analysts. IGR could also serve to augment academic evaluation and professional development.
CONCLUSION: Our study demonstrates the feasibility of using an index on the global reach of healthcare hashtags and tweeters.
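As an illustration of the additive index described in the methods above, the following minimal sketch computes a weighted sum of normalized indicators. The abstract does not state the normalization method or the weights, so min-max scaling and equal weights are assumptions here, and the hashtag names and counts are invented purely for demonstration.

```python
from typing import Dict, List

# The five indicator variables named in the abstract.
INDICATORS = ["impressions", "tweets", "tweeters", "locations", "languages"]


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale values to [0, 1]; assumed normalization, not confirmed by the source."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def global_reach(raw: Dict[str, Dict[str, float]],
                 weights: Dict[str, float]) -> Dict[str, float]:
    """Additive aggregation of weighted, normalized indicators per hashtag."""
    tags = list(raw)
    scores = {tag: 0.0 for tag in tags}
    for ind in INDICATORS:
        normalized = min_max_normalize([raw[tag][ind] for tag in tags])
        for tag, value in zip(tags, normalized):
            scores[tag] += weights[ind] * value
    return scores


# Hypothetical data: three made-up hashtags with made-up counts.
example_raw = {
    "#TagA": {"impressions": 9000, "tweets": 500, "tweeters": 200, "locations": 40, "languages": 12},
    "#TagB": {"impressions": 4000, "tweets": 300, "tweeters": 150, "locations": 25, "languages": 8},
    "#TagC": {"impressions": 1000, "tweets": 100, "tweeters": 50, "locations": 10, "languages": 4},
}
# Equal weights are an assumption; the study's actual weights are not given.
equal_weights = {ind: 0.2 for ind in INDICATORS}
example_scores = global_reach(example_raw, equal_weights)
ranking = sorted(example_scores, key=example_scores.get, reverse=True)
```

Because each indicator is scaled across the hashtags being compared, a score of this kind is relative to the chosen set: adding or removing a hashtag can change every score.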
PMID: 27645323 [PubMed - in process]
SPNsim: A database of simulated solitary pulmonary nodule PET/CT images facilitating computer aided diagnosis.
J Biomed Inform. 2016 Oct;63:357-365
Authors: Tzanoukos G, Athanasiadis E, Gaitanis A, Georgakopoulos A, Chatziioannou A, Chatziioannou S, Spyrou G
The aim of the present work was to design and develop a database of simulated solitary pulmonary nodules (SPN) in pairs of computed tomography (CT) and positron emission tomography (PET) images, using Monte Carlo (MC) simulation methods. We have developed an SPN image modeling pipeline to feed the database entitled SPNsim. The database is web-accessible and contains two subsets of simulated PET/CT SPN images. The first subset is currently composed of 1000 cases containing pairs of the transaxial CT and the corresponding PET slice with various types of simulated SPNs, presented as individual records. The second subset contains pairs of the transaxial CT and the corresponding PET slice of simulated SPNs, presenting cases of graded difficulty in diagnosis. The users of the database will have the ability to set queries in order to retrieve cases with certain characteristics, as well as characterized image sets. All images are freely available and may be downloaded from the website. SPNsim provides a useful reference data set for training and evaluation of computer aided detection (CAD) and diagnosis (CADx) systems focusing on SPN.
PMID: 27623536 [PubMed - in process]
Health at hand: A systematic review of smart watch uses for health and wellness.
J Biomed Inform. 2016 Oct;63:269-276
Authors: Reeder B, David A
INTRODUCTION: Smart watches have the potential to support health in everyday living by: enabling self-monitoring of personal activity; obtaining feedback based on activity measures; allowing for in-situ surveys to identify patterns of behavior; and supporting bi-directional communication with health care providers and family members. However, smart watches are an emerging technology and research with these devices is at a nascent stage.
METHODS: We conducted a systematic review of smart watch studies that engaged people in their use by searching PubMed, Embase, IEEE XPlore and ACM Digital libraries. Participant demographics, device features, watch applications and methods, and technical challenges were abstracted from included studies.
RESULTS: Seventy-three studies were returned in the search. Seventeen published studies were included. Included studies were published from 2014 to 2016, with the exception of one published in 2011. Most studies employed consumer-grade smart watches (14/17, 82%). Patient-related studies focused on activity monitoring, heart rate monitoring, speech therapy adherence, diabetes self-management, and detection of seizures, tremors, scratching, eating, and medication-taking behaviors. Most patient-related studies enrolled participants with few exclusion criteria to validate smart watch function (10/17, 58%). Only studies that focused on Parkinson's disease, epilepsy, and diabetes management enrolled persons living with targeted conditions. One study focused on nursing work in the ICU and one focused on CPR training for laypeople.
CONCLUSION: Consumer-grade smart watches have penetrated the health research space rapidly since 2014. Smart watch technical function, acceptability, and effectiveness in supporting health must be validated in larger field studies that enroll actual participants living with the conditions these devices target.
PMID: 27612974 [PubMed - in process]
GIST 2.0: A scalable multi-trait metric for quantifying population representativeness of individual clinical studies.
J Biomed Inform. 2016 Oct;63:325-336
Authors: Sen A, Chakrabarti S, Goldstein A, Wang S, Ryan PB, Weng C
The design of randomized controlled clinical studies can greatly benefit from iterative assessments of population representativeness of eligibility criteria. We propose a multi-trait metric, GIST 2.0, that can compute the a priori generalizability based on the population representativeness of a clinical study by explicitly modeling the dependencies among all eligibility criteria. We evaluate this metric on twenty clinical studies of two diseases and analyze how a study's eligibility criteria affect its generalizability (collectively and individually). We statistically analyze the effects of trial setting, trait selection and trait summarizing technique on GIST 2.0. Finally, we provide theoretical as well as empirical validations for the expected properties of GIST 2.0.
PMID: 27600407 [PubMed - in process]
Automated learning of domain taxonomies from text using background knowledge.
J Biomed Inform. 2016 Oct;63:295-306
Authors: Hoxha J, Jiang G, Weng C
In this paper, we present an automated method for taxonomy learning, focusing on concept formation and hierarchical relation learning. To infer such relations, we partition the extracted concepts and group them into closely-related clusters using Hierarchical Agglomerative Clustering, informed by syntactic matching and semantic relatedness functions. We introduce a novel, unsupervised method for cluster detection based on automated dendrogram pruning, which is dynamic to each partition. We evaluate our approach with two different types of textual corpora: clinical trial descriptions and MEDLINE publication abstracts. The results of several experiments indicate that our method is superior to existing dynamic pruning and state-of-the-art taxonomy learning methods. It yields higher concept coverage (95.75%) and higher accuracy of learned taxonomic relations (up to 0.71 average precision and 0.96 average recall).
PMID: 27597572 [PubMed - in process]
A Part-Of-Speech term weighting scheme for biomedical information retrieval.
J Biomed Inform. 2016 Oct;63:379-389
Authors: Wang Y, Wu S, Li D, Mehrabi S, Liu H
In the era of digitalization, information retrieval (IR), which retrieves and ranks documents from large collections according to users' search queries, has been widely applied in the biomedical domain. Building patient cohorts using electronic health records (EHRs) and searching literature for topics of interest are some IR use cases. Meanwhile, natural language processing (NLP), such as tokenization or Part-Of-Speech (POS) tagging, has been developed for processing clinical documents or biomedical literature. We hypothesize that NLP can be incorporated into IR to strengthen the conventional IR models. In this study, we propose two NLP-empowered IR models, POS-BoW and POS-MRF, which incorporate automatic POS-based term weighting schemes into bag-of-words (BoW) and Markov Random Field (MRF) IR models, respectively. In the proposed models, the POS-based term weights are iteratively calculated by utilizing a cyclic coordinate method where a golden section line search algorithm is applied along each coordinate to optimize the objective function defined by mean average precision (MAP). In the empirical experiments, we used the data sets from the Medical Records track in Text REtrieval Conference (TREC) 2011 and 2012 and the Genomics track in TREC 2004. The evaluation on TREC 2011 and 2012 Medical Records tracks shows that, for the POS-BoW models, the mean improvement rates for IR evaluation metrics, MAP, bpref, and P@10, are 10.88%, 4.54%, and 3.82%, compared to the BoW models; and for the POS-MRF models, these rates are 13.59%, 8.20%, and 8.78%, compared to the MRF models. Additionally, we experimentally verify that the proposed weighting approach is superior to the simple heuristic and frequency based weighting approaches, and validate our POS category selection. Using the optimal weights calculated in this experiment, we tested the proposed models on the TREC 2004 Genomics track and obtained average improvement rates of 8.63% and 10.04% for POS-BoW and POS-MRF, respectively. These significant improvements verify the effectiveness of leveraging POS tagging for biomedical IR tasks.
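The optimization scheme named in this abstract, a cyclic coordinate method with a golden section line search along each coordinate, can be sketched generically as follows. This is only an illustration of the technique, not the authors' implementation: the objective below is a hypothetical smooth stand-in (true MAP requires running a retrieval system and is not smooth), and the bounds, starting point and tolerances are assumptions.

```python
import math
from typing import Callable, List

INV_PHI = (math.sqrt(5) - 1) / 2  # inverse golden ratio, about 0.618


def golden_section_max(f: Callable[[float], float],
                       lo: float, hi: float, tol: float = 1e-5) -> float:
    """Locate the maximizer of a unimodal function f on [lo, hi]."""
    a, b = lo, hi
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:
            # Maximum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:
            # Maximum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return (a + b) / 2


def cyclic_coordinate_max(objective: Callable[[List[float]], float],
                          start: List[float], lo: float = 0.0, hi: float = 1.0,
                          sweeps: int = 50, tol: float = 1e-12) -> List[float]:
    """Optimize one weight at a time, cycling until a sweep stops improving."""
    weights = list(start)
    best = objective(weights)
    for _ in range(sweeps):
        for i in range(len(weights)):
            def along_axis(x: float, i: int = i) -> float:
                return objective(weights[:i] + [x] + weights[i + 1:])
            weights[i] = golden_section_max(along_axis, lo, hi)
        current = objective(weights)
        if current - best < tol:
            break
        best = current
    return weights


def toy_objective(w: List[float]) -> float:
    """Hypothetical concave surrogate with its maximum at (0.7, 0.2)."""
    x, y = w[0] - 0.7, w[1] - 0.2
    return -(x * x + y * y + 0.5 * x * y)


optimal = cyclic_coordinate_max(toy_objective, [0.5, 0.5])
```

Golden section search assumes the objective is unimodal along each coordinate; on a real, piecewise-constant metric the search would only find a locally good weight.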
PMID: 27593166 [PubMed - in process]
Open source platform for collaborative construction of wearable sensor datasets for human motion analysis and an application for gait analysis.
J Biomed Inform. 2016 Oct;63:249-258
Authors: Llamas C, González MA, Hernández C, Vegas J
Nearly every practical improvement in modeling human motion is well founded in a properly designed collection of data or datasets. These datasets must be made publicly available so that the community can validate and accept them. It is reasonable to concede that a collective, guided enterprise could serve to devise solid and substantial datasets, as a result of a collaborative effort, in the same sense as the open software community does. In this way datasets could be complemented, extended and expanded in size with, for example, more individuals, samples and human actions. For this to be possible some commitments must be made by the collaborators, one of them being to share the same data acquisition platform. In this paper, we offer an affordable open source hardware and software platform based on inertial wearable sensors so that several groups could cooperate in the construction of datasets through common software suitable for collaboration. Some experimental results about the throughput of the overall system are reported, showing the feasibility of acquiring data from up to 6 sensors with a sampling frequency of no less than 118 Hz. Also, a proof-of-concept dataset is provided comprising sampled data from 12 subjects suitable for gait analysis.
PMID: 27593165 [PubMed - in process]
Stress modelling and prediction in presence of scarce data.
J Biomed Inform. 2016 Oct;63:344-356
Authors: Maxhuni A, Hernandez-Leal P, Sucar LE, Osmani V, Morales EF, Mayora O
OBJECTIVE: Stress at work is a significant occupational health concern. Recent studies have used various sensing modalities to model stress behaviour based on non-obtrusive data obtained from smartphones. However, when the data for a subject is scarce it becomes a challenge to obtain a good model.
METHODS: We propose an approach based on a combination of techniques: semi-supervised learning, ensemble methods and transfer learning to build a model of a subject with scarce data. Our approach is based on the comparison of decision trees to select the closest subject for knowledge transfer.
RESULTS: We present a real-life, unconstrained study carried out with 30 employees within two organisations. The results show that using information (instances or model) from similar subjects can improve the accuracy of the subjects with scarce data. However, using transfer learning from dissimilar subjects can have a detrimental effect on the accuracy. Our proposed ensemble approach increased the accuracy by ≈10% to 71.58% compared to not using any transfer learning technique.
CONCLUSIONS: In contrast to high precision but highly obtrusive sensors, using smartphone sensors for measuring daily behaviours allowed us to quantify behaviour changes, relevant to occupational stress. Furthermore, we have shown that use of transfer learning to select data from close models is a useful approach to improve accuracy in presence of scarce data.
PMID: 27592309 [PubMed - in process]
Identification, analysis, and interpretation of a human serum metabolomics causal network in an observational study.
J Biomed Inform. 2016 Oct;63:337-343
Authors: Yazdani A, Yazdani A, Samiei A, Boerwinkle E
Untargeted metabolomics, the measurement of large numbers of metabolites irrespective of their chemical or biologic characteristics, has proven useful for identifying novel biomarkers of health and disease. Of particular importance is the analysis of networks of metabolites, as opposed to the level of an individual metabolite. The aim of this study is to achieve causal inference among serum metabolites in an observational setting. A metabolomics causal network is identified using the genome granularity directed acyclic graph (GDAG) algorithm, where information across the genome at a deeper level of granularity is extracted to create strong instrumental variables and identify causal relationships among metabolites at an upper level of granularity. Information from 1,034,945 genetic variants distributed across the genome was used to identify a metabolomics causal network among 122 serum metabolites. We introduce individual properties within the network, such as strength of a metabolite. Based on these properties, hypothesized targets for intervention and prediction are identified. Four nodes corresponding to the metabolites leucine, arachidonoyl-glycerophosphocholine, N-acetylalanine, and glutarylcarnitine had high impact on the entire network by virtue of having multiple arrows pointing out, which propagated long distances. Five modules, largely corresponding to functional metabolite categories (e.g. amino acids), were identified over the network and module boundaries were determined using directionality and causal effect sizes. Two families, each consisting of a triangular motif identified in the network, had essential roles in the network by virtue of influencing a large number of other nodes. We discuss causal effect measurement while confounders and mediators are identified graphically.
PMID: 27592308 [PubMed - in process]
Confronting systemic challenges in interoperable medical device safety, security & usability.
J Biomed Inform. 2016 Oct;63:226-234
Authors: Samaras EA, Samaras GM
PMID: 27586864 [PubMed - in process]
Modelling assistive technology adoption for people with dementia.
J Biomed Inform. 2016 Oct;63:235-248
Authors: Chaurasia P, McClean SI, Nugent CD, Cleland I, Zhang S, Donnelly MP, Scotney BW, Sanders C, Smith K, Norton MC, Tschanz J
PURPOSE: Assistive technologies have been identified as a potential solution for the provision of elderly care. Such technologies have in general the capacity to enhance the quality of life and increase the level of independence among their users. Nevertheless, the acceptance of these technologies is crucial to their success. Generally speaking, the elderly are not well-disposed to technologies and have limited experience; these factors contribute towards limiting the widespread acceptance of technology. It is therefore important to evaluate the potential success of technologies prior to their deployment.
MATERIALS AND METHODS: The research described in this paper builds upon our previous work on modelling adoption of assistive technology, in the form of cognitive prosthetics such as reminder apps, and aims at identifying a refined sub-set of features which offer improved accuracy in predicting technology adoption. Consequently, in this paper, an adoption model is built using a set of features extracted from a user's background to minimise the likelihood of non-adoption. The work is based on analysis of data from the Cache County Study on Memory and Aging (CCSMA) with 31 features covering a range of age, gender, education and details of health condition. In the process of modelling adoption, feature selection and feature reduction are carried out, followed by identifying the best classification models.
FINDINGS: With the reduced set of labelled features the technology adoption model built achieved an average prediction accuracy of 92.48% when tested on 173 participants.
CONCLUSIONS: We conclude that modelling user adoption from a range of parameters such as physical, environmental and social perspectives is beneficial in recommending a technology to a particular user based on their profile.
PMID: 27586863 [PubMed - in process]
Using uncertain data from body-worn sensors to gain insight into type 1 diabetes.
J Biomed Inform. 2016 Oct;63:259-268
Authors: Heintzman N, Kleinberg S
The amount of observational data available for research is growing rapidly with the rise of electronic health records and patient-generated data. However, these data bring new challenges, as data collected outside controlled environments and generated for purposes other than research may be error-prone, biased, or systematically missing. Analysis of these data requires methods that are robust to such challenges, yet methods for causal inference currently only handle uncertainty at the level of causal relationships - rather than variables or specific observations. In contrast, we develop a new approach for causal inference from time series data that allows uncertainty at the level of individual data points, so that inferences depend more strongly on variables and individual observations that are more certain. In the limit, a completely uncertain variable will be treated as if it were not measured. Using simulated data we demonstrate that the approach is more accurate than the state of the art, making substantially fewer false discoveries. Finally, we apply the method to a unique set of data collected from 17 individuals with type 1 diabetes mellitus (T1DM) in free-living conditions over 72h where glucose levels, insulin dosing, physical activity and sleep are measured using body-worn sensors. These data often have high rates of error that vary across time, but we are able to uncover the relationships such as that between anaerobic activity and hyperglycemia. Ultimately, better modeling of uncertainty may enable better translation of methods to free-living conditions, as well as better use of noisy and uncertain EHR data.
PMID: 27580935 [PubMed - in process]
Personas in online health communities.
J Biomed Inform. 2016 Oct;63:212-225
Authors: Huh J, Kwon BC, Kim SH, Lee S, Choo J, Kim J, Choi MJ, Yi JS
Many researchers and practitioners use online health communities (OHCs) to influence health behavior and provide patients with social support. One of the biggest challenges in this approach, however, is the rate of attrition. OHCs face problems similar to those of other social media platforms, where user migration happens unless tailored content and appropriate socialization are supported. To provide tailored support for each OHC user, we developed personas in OHCs illustrating users' needs and requirements in OHC use. To develop OHC personas, we first interviewed 16 OHC users and administrators to qualitatively understand varying user needs in OHC. Based on their responses, we developed an online survey to systematically investigate OHC personas. We received 184 survey responses from OHC users, which informed their values and their OHC use patterns. We performed open coding analysis with the interview data and cluster analysis with the survey data and consolidated the analyses of the two datasets. Four personas emerged: Caretakers, Opportunists, Scientists, and Adventurers. The results inform users' interaction behavior and attitude patterns with OHCs. We discuss implications for how these personas inform OHCs in delivering personalized informational and emotional support.
PMID: 27568913 [PubMed - in process]
Comparing efficient data structures to represent geometric models for three-dimensional virtual medical training.
J Biomed Inform. 2016 Oct;63:195-211
Authors: Bíscaro HH, Nunes FL, Dos Santos Oliveira J, Pereira GR
Data structures have been explored for several domains of computer applications in order to ensure efficiency in data storage and retrieval. However, data structures can behave differently depending on the applications in which they are used. Three-dimensional interactive environments offered by techniques of Virtual Reality require operations of loading and manipulating objects in real time, where realism and response time are two important requirements. Efficient representation of geometrical models plays an important part in making the simulation realistic. In this paper, we present the implementation and the comparison of two topologically efficient data structures - Compact Half-Edge and Mate-Face - for the representation of objects for three-dimensional interactive environments. The structures have been tested under different processor and RAM configurations. The results show that both these structures can be used in an efficient manner. The Mate-Face structure has shown itself to be more efficient for the manipulation of neighborhood relationships and the Compact Half-Edge was more efficient for loading of the geometric models. We also evaluated the data structures embedded in applications of biopsy simulation using virtual reality, considering a deformation simulation method applied in virtual human organs. The results showed that their use allows the building of applications considering objects with high resolutions (number of vertices), without significant impact on the time spent in the simulation. Therefore, their use contributes to the construction of more realistic simulators.
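To make the idea of a topological mesh structure concrete, here is a minimal sketch of a plain, textbook half-edge representation for triangle meshes. It is neither the Compact Half-Edge nor the Mate-Face structure evaluated in the paper, whose layouts the abstract does not describe; the function names and the two-triangle example are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class HalfEdge:
    origin: int                 # index of the vertex this half-edge starts from
    face: int                   # index of the triangle this half-edge borders
    next: int                   # index of the next half-edge around the same face
    twin: Optional[int] = None  # opposite half-edge; None on a mesh boundary


def build_half_edges(faces: List[Tuple[int, int, int]]) -> List[HalfEdge]:
    """Create three half-edges per triangle and link twins across shared edges.

    Assumes consistently oriented triangles, so a shared edge appears as
    (a, b) in one face and (b, a) in its neighbor.
    """
    edges: List[HalfEdge] = []
    by_endpoints: Dict[Tuple[int, int], int] = {}
    for f, tri in enumerate(faces):
        base = len(edges)
        for k in range(3):
            a, b = tri[k], tri[(k + 1) % 3]
            edges.append(HalfEdge(origin=a, face=f, next=base + (k + 1) % 3))
            by_endpoints[(a, b)] = base + k
    for (a, b), idx in by_endpoints.items():
        edges[idx].twin = by_endpoints.get((b, a))
    return edges


def adjacent_faces(edges: List[HalfEdge], face: int) -> List[int]:
    """Walk the three half-edges of a face and collect neighbors via twins."""
    result: List[int] = []
    start = 3 * face  # half-edges of face f are stored at 3f, 3f+1, 3f+2
    h = start
    while True:
        twin = edges[h].twin
        if twin is not None:
            result.append(edges[twin].face)
        h = edges[h].next
        if h == start:
            break
    return result


# Hypothetical mesh: two triangles sharing the edge between vertices 1 and 2.
mesh_faces = [(0, 1, 2), (2, 1, 3)]
mesh_edges = build_half_edges(mesh_faces)
neighbors_of_first = adjacent_faces(mesh_edges, 0)
```

Once the twins are linked, a neighborhood query costs a constant number of pointer hops per face, which is the kind of operation such structures are designed to make cheap.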
PMID: 27568296 [PubMed - in process]
Increasing fall risk awareness using wearables: A fall risk awareness protocol.
J Biomed Inform. 2016 Oct;63:184-194
Authors: Danielsen A, Olofsen H, Bremdal BA
Each year about a third of elderly people aged 65 or older experience a fall. Many of these falls might have been avoided if fall risk assessment and prevention tools were available in a daily living situation. We identify what kind of information is relevant for doing fall risk assessment and prevention using wearable sensors in a daily living environment by investigating current research, distinguishing between prospective and context-aware fall risk assessment and prevention. Based on our findings, we propose a fall risk awareness protocol as a fall prevention tool integrating both wearables and ambient sensing technology into a single platform.
PMID: 27544413 [PubMed - in process]
Increasing EHR system usability through standards: Conformance criteria in the HL7 EHR-system functional model.
J Biomed Inform. 2016 Oct;63:169-173
Authors: Meehan RA, Mon DT, Kelly KM, Rocca M, Dickinson G, Ritter J, Johnson CM
Though substantial work has been done on the usability of health information technology, improvements in electronic health record system (EHR) usability have been slow, creating frustration, distrust of EHRs and the use of potentially unsafe work-arounds. Usability standards could be part of the solution for improving EHR usability. EHR system functional requirements and standards have been used successfully in the past to specify system behavior, the criteria of which have been gradually implemented in EHR systems through certification programs and other national health IT strategies. Similarly, functional requirements and standards for usability can help address the multitude of sequelae associated with poor usability. This paper describes the evidence-based functional requirements for usability contained in the Health Level Seven (HL7) EHR System Functional Model, and the benefits of open and voluntary EHR system usability standards.
PMID: 27523469 [PubMed - in process]
A model-driven methodology for exploring complex disease comorbidities applied to autism spectrum disorder and inflammatory bowel disease.
J Biomed Inform. 2016 Oct;63:366-378
Authors: Somekh J, Peleg M, Eran A, Koren I, Feiglin A, Demishtein A, Shiloh R, Heiner M, Kong SW, Elazar Z, Kohane I
We propose a model-driven methodology aimed to shed light on complex disorders. Our approach enables exploring shared etiologies of comorbid diseases at the molecular pathway level. The method, Comparative Comorbidities Simulation (CCS), uses stochastic Petri net simulation for examining the phenotypic effects of perturbation of a network known to be involved in comorbidities to predict new roles for mutations in comorbid conditions. To demonstrate the utility of our novel methodology, we investigated the molecular convergence of autism spectrum disorder (ASD) and inflammatory bowel disease (IBD) on the autophagy pathway. In addition to validation by domain experts, we used formal analyses to demonstrate the model's self-consistency. We then used CCS to compare the effects of loss of function (LoF) mutations previously implicated in either ASD or IBD on the autophagy pathway. CCS identified similar dynamic consequences of these mutations in the autophagy pathway. Our method suggests that two LoF mutations previously implicated in IBD may contribute to ASD, and one ASD-implicated LoF mutation may play a role in IBD. Future targeted genomic or functional studies could be designed to directly test these predictions.
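For readers unfamiliar with stochastic Petri net simulation, the sketch below shows one common way to simulate such a net, using Gillespie-style event sampling with mass-action firing rates. It is a generic illustration and not the CCS method or the authors' autophagy model: the places "A" and "B", the single transition, the rates, and the modelling of a loss-of-function perturbation as a zero firing rate are all assumptions made for this example.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Transition:
    name: str
    rate: float               # firing rate constant
    inputs: Dict[str, int]    # place -> tokens consumed per firing
    outputs: Dict[str, int]   # place -> tokens produced per firing


def propensity(t: Transition, marking: Dict[str, int]) -> float:
    """Mass-action propensity; zero when the transition is not enabled."""
    if any(marking[p] < n for p, n in t.inputs.items()):
        return 0.0
    value = t.rate
    for p in t.inputs:
        value *= marking[p]
    return value


def simulate(marking: Dict[str, int], transitions: List[Transition],
             t_end: float, seed: int = 0) -> Dict[str, int]:
    """Fire transitions at exponentially distributed times until t_end."""
    rng = random.Random(seed)
    state = dict(marking)
    time = 0.0
    while True:
        props = [propensity(t, state) for t in transitions]
        total = sum(props)
        if total == 0.0:
            break  # no transition is enabled
        time += rng.expovariate(total)
        if time > t_end:
            break
        chosen = rng.choices(transitions, weights=props)[0]
        for p, n in chosen.inputs.items():
            state[p] -= n
        for p, n in chosen.outputs.items():
            state[p] += n
    return state


# Hypothetical two-place net: tokens move from place "A" to place "B".
initial = {"A": 20, "B": 0}
normal = [Transition("convert", 1.0, {"A": 1}, {"B": 1})]
# Assumed way to model a loss-of-function perturbation: the rate drops to zero.
knocked_out = [Transition("convert", 0.0, {"A": 1}, {"B": 1})]

final_normal = simulate(initial, normal, t_end=5.0)
final_knocked_out = simulate(initial, knocked_out, t_end=5.0)
```

Comparing the final markings of the unperturbed and perturbed runs, ideally over many seeds, is the basic pattern for examining the effect of a perturbation on network dynamics.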
PMID: 27522000 [PubMed - in process]
Clinical Trial Information Mediator.
J Biomed Inform. 2016 Oct;63:157-168
Authors: Krauth C, Kuchinke W, Eckert M, Bergmann R, Braasch B, Karakoyun T, Ohmann C
For research in biomedical sciences, cross-domain searches through several different databases are an increasingly necessary task that often becomes a time-consuming and labour-intensive process. This is especially the case when different domain databases have to be combined, for example in combined searches of clinical trials registries, publication databases and research databases. The Clinical Trial Information Mediator (CTIM) addresses this problem and offers a novel way to perform a combined search in ClinicalTrials.gov, PubMed and BioSamples. CTIM was developed based on a requirements analysis and implemented using open source technology. A search engine with a graphical user interface was developed in order to search linked data in the three databases ClinicalTrials.gov, PubMed and BioSamples, thereby enabling CTIM to bridge the gap between the different knowledge domains of clinical trials, publications of research results and biosamples/genetic information. CTIM was applied in three use cases demonstrating that information retrieval could be considerably improved for complex queries. These use cases show that more relevant results were obtained and more associated publications and biosamples could be retrieved in comparison to separate single searches. The main advantages of CTIM are the identification of related information between clinical trials and publications by means of a clinical-trial-centred search, simplified access to its databases and thus reduced search time. In addition, it can be used by researchers without prior training because of its intuitive usage.
PMID: 27515925 [PubMed - in process]
The quest for engaging AmI: Patient engagement and experience design tools to promote effective assisted living.
J Biomed Inform. 2016 Oct;63:150-156
Authors: Triberti S, Barello S
Recent research highlights that patient engagement, conceived as a patient's behavioral, cognitive and emotional commitment to his own care management, is a key issue while implementing new technologies in the healthcare process. Indeed, eHealth interventions may systematically fail when the patient's subjective experience has not been taken into consideration since the first steps of the technology design. In the present contribution, we argue that such an issue is more and more crucial with regard to the field of Ambient Intelligence (AmI). Specifically, the very concept of technologies embedded in the patients' surrounding environment implies a strong impact on their everyday life, which can be perceived as a limitation to autonomy and privacy, and therefore refused or even openly opposed by the final users. The present contribution tackles this issue directly, highlighting: (1) a theoretical framework to include patient engagement in the design of AmI technologies; (2) assessment measures for patient engagement while developing and testing the effectiveness of AmI prototypes for healthcare. Finally (3) this contribution provides an overview of the main issues emerging while implementing AmI technologies and suggests specific design solutions to address them.
PMID: 27515924 [PubMed - in process]
Automated population of an i2b2 clinical data warehouse from an openEHR-based data repository.
J Biomed Inform. 2016 Oct;63:277-294
Authors: Haarbrandt B, Tute E, Marschollek M
BACKGROUND: Detailed Clinical Model (DCM) approaches have recently seen wider adoption. More specifically, openEHR-based application systems are now used in production in several countries, serving diverse fields of application such as health information exchange, clinical registries and electronic medical record systems. However, approaches to efficiently provide openEHR data to researchers for secondary use have not yet been investigated or established.
METHODS: We developed an approach to automatically load openEHR data instances into the open source clinical data warehouse i2b2. We evaluated query capabilities and the performance of this approach in the context of the Hanover Medical School Translational Research Framework (HaMSTR), an openEHR-based data repository.
RESULTS: Automated creation of i2b2 ontologies from archetypes and templates and the integration of openEHR data instances from 903 patients of a paediatric intensive care unit have been achieved. In total, it took an average of approximately 2527 s to create 2,311,624 facts from 141,917 XML documents. Using the imported data, we conducted sample queries to compare the performance with two openEHR systems and to investigate whether this representation of data can feasibly support cohort identification and record level data extraction.
DISCUSSION: We found the automated population of an i2b2 clinical data warehouse to be a feasible approach to make openEHR data instances available for secondary use. Such an approach can facilitate timely provision of clinical data to researchers. It complements analytics based on the Archetype Query Language by allowing querying of both legacy clinical data sources and openEHR data instances at the same time and by providing an easy-to-use query interface. However, due to different levels of expressiveness in the data models, not all semantics could be preserved during the ETL process.
PMID: 27507090 [PubMed - in process]
Correlation between videogame mechanics and executive functions through EEG analysis.
J Biomed Inform. 2016 Oct;63:131-140
Authors: Mondéjar T, Hervás R, Johnson E, Gutierrez C, Latorre JM
This paper addresses a different point of view of videogames, specifically serious games for health. It contributes to that area with a multidisciplinary perspective focused on neurosciences and computation. The experiment population consisted of pre-adolescents between the ages of 8 and 12 without any cognitive issues. The experiment consisted of users playing videogames as well as performing traditional psychological assessments; during these tasks the frontal brain activity was evaluated. The main goal was to analyse how the frontal lobe of the brain (executive function) works in terms of prominent cognitive skills during five types of game mechanics widely used in commercial videogames. The analysis was made by collecting brain signals with an electroencephalogram neuroheadset during the two phases of the experiment. The validated hypotheses were that videogames can develop executive functioning and that it is possible to identify which kind of cognitive skills are developed during each kind of typical videogame mechanic. The results contribute to the design of serious games for health purposes on a conceptual level, particularly in support of the diagnosis and treatment of cognitive-related pathologies.
PMID: 27507089 [PubMed - in process]
Analysis of high-order SNP barcodes in mitochondrial D-loop for chronic dialysis susceptibility.
J Biomed Inform. 2016 Oct;63:112-119
Authors: Yang CH, Lin YD, Chuang LY, Chang HW
OBJECTIVES: Positively identifying disease-associated single nucleotide polymorphism (SNP) markers in genome-wide studies entails the complex association analysis of a huge number of SNPs. Such large numbers of SNP barcodes (SNP/genotype combinations) continue to pose serious computational challenges, especially for high-dimensional data.
METHODS: We propose a novel method for exploiting SNP barcodes based on differential evolution, termed IDE (improved differential evolution). IDE uses a "top combination strategy" to improve the ability of differential evolution to explore high-order SNP barcodes in high-dimensional data.
RESULTS: We simulate disease data and use real chronic dialysis data to test four global optimization algorithms. In 48 simulated disease models, we show that IDE outperforms existing global optimization algorithms in terms of exploring ability and power to detect the specific SNP/genotype combinations with a maximum difference between cases and controls. In real data, we show that IDE can be used to evaluate the relative effects of each individual SNP on disease susceptibility.
CONCLUSION: IDE generated significant SNP barcodes with less computational complexity than the other algorithms, making IDE ideally suited for analysis of high-order SNP barcodes.
PMID: 27507088 [PubMed - in process]
A vision based proposal for classification of normal and abnormal gait using RGB camera.
J Biomed Inform. 2016 Oct;63:82-89
Authors: Nieto-Hidalgo M, Ferrández-Pastor FJ, Valdivieso-Sarabia RJ, Mora-Pascual J, García-Chamizo JM
Human gait is mainly related to the foot and leg movements but, obviously, the entire motor system of the human body is involved. We hypothesise that movement parameters such as dynamic balance and the movement harmony of each body element (arms, head, thorax…) could enable us to finely characterise gait singularities to pinpoint potential diseases or abnormalities in advance. Since this paper deals with the preliminary problem pertaining to the classification of normal and abnormal gait, our study revolves around the lower part of the body. Our proposal presents a functional specification of gait in which only observational kinematic aspects are discussed. We are confident that the resultant specification will be open enough to be applied to a variety of gait analysis problems encountered in areas connected to rehabilitation, sports, children's motor skills, and so on. To carry out our functional specification, we develop an extraction system through which we analyse image sequences to identify gait features. Our prototype not only readily lets us determine the dynamic parameters (heel strike, toe off, stride length and time) and some skeleton joints but also satisfactorily supplies us with a proper distinction between normal and abnormal gait. We have performed experiments on a dataset of 30 samples.
PMID: 27498069 [PubMed - in process]
Boosting backpropagation algorithm by stimulus-sampling: Application in computer-aided medical diagnosis.
J Biomed Inform. 2016 Oct;63:74-81
Authors: Gorunescu F, Belciug S
Neural networks (NNs), in general, and the multi-layer perceptron (MLP), in particular, represent one of the most efficient classifiers among the machine learning (ML) algorithms. Inspired by the stimulus-sampling paradigm, it is plausible to assume that the association of stimuli with the neurons in the output layer of an MLP can increase its performance. The stimulus-sampling process is assumed memoryless (Markovian), in the sense that the choice of a particular stimulus at a certain step, conditioned by the whole prior evolution of the learning process, depends only on the network's answer at the previous step. This paper proposes a novel learning technique that enhances the performance of the standard backpropagation algorithm with the aid of a stimulus-sampling procedure applied to the output neurons. The network uses the observable behavior that varies throughout the training process by stimulating the correct answers through corresponding rewards/penalties assigned to the output neurons. The proposed model has been applied in computer-aided medical diagnosis using five real-life breast cancer, colon cancer, diabetes, thyroid, and fetal heartbeat databases. The statistical comparison to well-established ML algorithms proved beyond doubt its efficiency and robustness.
PMID: 27498068 [PubMed - in process]
Computing disease incidence, prevalence and comorbidity from electronic medical records.
J Biomed Inform. 2016 Oct;63:108-111
Authors: Bagley SC, Altman RB
Electronic medical records (EMR) represent a convenient source of coded medical data, but disease patterns found in EMRs may be biased when compared to surveys based on sampling. In this communication we draw attention to complications that arise when using EMR data to calculate disease prevalence, incidence, age of onset, and disease comorbidity. We review known solutions to these problems and identify challenges for future work.
PMID: 27498067 [PubMed - in process]
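The quantities named in the entry above (prevalence and incidence) have standard textbook definitions, sketched below. This is a minimal illustration only; the function names and the example counts are hypothetical and are not taken from the communication, which concerns the biases that arise when such quantities are computed from EMR data.

```python
def point_prevalence(existing_cases: int, population: int) -> float:
    """Proportion of a population that has the disease at one point in time."""
    if population <= 0:
        raise ValueError("population must be positive")
    return existing_cases / population


def incidence_rate(new_cases: int, person_years_at_risk: float) -> float:
    """New cases per unit of person-time among people at risk."""
    if person_years_at_risk <= 0:
        raise ValueError("person-time at risk must be positive")
    return new_cases / person_years_at_risk


# Hypothetical counts, chosen only to illustrate the arithmetic.
prevalence = point_prevalence(existing_cases=50, population=1000)
incidence = incidence_rate(new_cases=12, person_years_at_risk=4000.0)
print(f"prevalence = {prevalence:.3f}")
print(f"incidence  = {incidence * 1000:.1f} per 1000 person-years")
```

In EMR data the denominators are the hard part: the population and the person-time at risk are restricted to patients who interact with the health system, which is one source of the bias the abstract describes.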
Let's get Physiqual - An intuitive and generic method to combine sensor technology with ecological momentary assessments.
J Biomed Inform. 2016 Oct;63:141-149
Authors: Blaauw FJ, Schenk HM, Jeronimus BF, van der Krieke L, de Jonge P, Aiello M, Emerencia AC
The emergence of wearables and smartwatches is making sensors a ubiquitous technology to measure daily rhythms in physiological measures, such as movement and heart rate. An integration of sensor data from wearables and self-report questionnaire data about cognition, behaviors, and emotions can provide new insights into the interaction of mental and physiological processes in daily life. Hitherto, no method existed that enabled an easy-to-use integration of sensor and self-report data. To fill this gap, we present 'Physiqual', a platform for researchers that gathers and integrates data from commercially available sensors and service providers into one unified format for use in Ecological Momentary Assessments (EMA) or Experience Sampling Methods (ESM), and Quantified Self (QS). Physiqual currently supports sensor data provided by two well-known service providers and therewith a wide range of smartwatches and wearables. To demonstrate the features of Physiqual, we conducted a case study in which we assessed two subjects by means of data from an EMA study combined with sensor data as aggregated and exported by Physiqual. To the best of our knowledge, the Physiqual platform is the first platform that allows researchers to conveniently aggregate and integrate physiological sensor data with EMA studies.
PMID: 27498066 [PubMed - in process]
Multidisciplinary production of interactive environments to support occupational therapies.
J Biomed Inform. 2016 Oct;63:90-99
Authors: Cardona Reyes H, Muñoz Arteaga J
This work proposes a multidisciplinary production of interactive environments as technological support for the rehabilitation of people with physical disabilities attending occupational therapy. Nowadays, some technologies and methods are used to develop software to assist people who have some kind of physical disability, but physical therapies are not limited to only one rehabilitation technique. The current work promotes establishing a multidisciplinary team, such as therapists and technologists, who can collaborate in the production of interactive environments according to the evolution of every patient's rehabilitation. The performance of the current proposal is presented through related work and a case study with several usability evaluations.
PMID: 27497781 [PubMed - in process]
An approach for deciphering patient-specific variations with application to breast cancer molecular expression profiles.
J Biomed Inform. 2016 Oct;63:120-130
Authors: Nagarajan R, Upreti M
Several studies have successfully used molecular expression profiling in conjunction with classification techniques for discerning distinct disease groups. However, a majority of these studies do not provide sufficient insights into potential patient-specific variations within the disease groups. Such variations are ubiquitous and manifest across multiple scales with varying resolution. There is an urgent need for novel approaches that fall within the objective of precision medicine and provide novel insights into patient-specific variations and sub-populations within disease groups while discerning the disease groups of interest so as to enable timely and targeted intervention of select subjects. This study presents a selective-voting ensemble classification approach (SVA) for discerning good and poor-prognosis breast cancer samples from their 70-gene molecular expression profile revealing patient-specific variations within the poor-prognosis group. In contrast to traditional classification, SVA adapts the feature sets in a sample-specific manner capturing the proclivity of the samples to each of the disease groups. Correlation between normalized vote counts from SVA and clinical outcomes of the subjects is elucidated. The performance of Support Vector Machine and Naïve Bayes classifiers is investigated within the SVA framework and compared to established clinical criteria (Nottingham Prognostic Index, Adjuvant Online, St. Gallen) and the Mammaprint approach. Weighted undirected graph abstractions of the ensemble sets of the poor-prognosis test samples are also shown to exhibit markedly different topologies with varying proclivities. These patient-specific networks may reflect inherent variations in underlying signaling mechanisms in the poor-prognosis subjects and reveal potential targets for personalized therapeutic intervention.
PMID: 27477838 [PubMed - in process]
Using concept hierarchies to improve calculation of patient similarity.
J Biomed Inform. 2016 Oct;63:66-73
Authors: Girardi D, Wartner S, Halmerbauer G, Ehrenmüller M, Kosorus H, Dreiseitl S
OBJECTIVE: We introduce a new distance measure that is better suited than traditional methods to detecting similarities in patient records by referring to a concept hierarchy.
MATERIALS AND METHODS: The new distance measure improves on distance measures for categorical values by taking the path distance between concepts in a hierarchy into account. We evaluate and compare the new measure on a data set of 836 patients.
RESULTS: The new measure shows marked improvements over the standard measures, both qualitatively and quantitatively. Using the new measure for clustering patient data reveals structure that is otherwise not visible. Statistical comparisons of distances within patient groups with similar diagnoses show that the new measure is significantly better at detecting these similarities than the standard measures.
CONCLUSION: The new distance measure is an improvement over the current standard whenever a hierarchical arrangement of categorical values is available.
PMID: 27477837 [PubMed - in process]
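The entry above relies on the path distance between concepts in a hierarchy. The sketch below shows one common way to compute such a distance in a tree (counting edges through the nearest common ancestor). It is an illustration under assumptions, not the authors' measure: the toy hierarchy, the `parent` mapping, and the function names are all hypothetical.

```python
def path_to_root(concept, parent):
    """Return the list of concepts from `concept` up to the root."""
    path = [concept]
    while concept in parent:
        concept = parent[concept]
        path.append(concept)
    return path


def path_distance(a, b, parent):
    """Number of edges on the shortest path between two concepts in a tree."""
    # Steps needed to reach each ancestor of `a` (including `a` itself).
    steps_from_a = {node: i for i, node in enumerate(path_to_root(a, parent))}
    # Walk up from `b` until the first ancestor shared with `a` is found.
    for steps_from_b, node in enumerate(path_to_root(b, parent)):
        if node in steps_from_a:
            return steps_from_a[node] + steps_from_b
    raise ValueError("concepts do not share a common ancestor")


# Hypothetical toy hierarchy, stored as child -> parent.
parent = {
    "cardiovascular": "disease",
    "respiratory": "disease",
    "hypertension": "cardiovascular",
    "heart failure": "cardiovascular",
    "asthma": "respiratory",
}

print(path_distance("hypertension", "heart failure", parent))  # siblings
print(path_distance("hypertension", "asthma", parent))         # different branches
```

A plain categorical measure would treat all three diagnoses as equally different from one another, whereas the path distance places the two cardiovascular concepts closer together than either is to asthma.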
Unsupervised detection and analysis of changes in everyday physical activity data.
J Biomed Inform. 2016 Oct;63:54-65
Authors: Sprint G, Cook DJ, Schmitter-Edgecombe M
Sensor-based time series data can be utilized to monitor changes in human behavior as a person makes a significant lifestyle change, such as progress toward a fitness goal. Recently, wearable sensors have increased in popularity as people aspire to be more conscientious of their physical health. Automatically detecting and tracking behavior changes from wearable sensor-collected physical activity data can provide a valuable monitoring and motivating tool. In this paper, we formalize the problem of unsupervised physical activity change detection and address the problem with our Physical Activity Change Detection (PACD) approach. PACD is a framework that detects changes between time periods, determines significance of the detected changes, and analyzes the nature of the changes. We compare the abilities of three change detection algorithms from the literature and one proposed algorithm to capture different types of changes as part of PACD. We illustrate and evaluate PACD on synthetic data and using Fitbit data collected from older adults who participated in a health intervention study. Results indicate PACD detects several changes in both datasets. The proposed change algorithms and analysis methods are useful data mining techniques for unsupervised, window-based change detection with potential to track users' physical activity and motivate progress toward their health goals.
PMID: 27471222 [PubMed - in process]
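To make the idea of window-based change significance in the entry above concrete, the sketch below compares two windows of activity data with a permutation test on the difference of means. This is a generic, swapped-in technique used purely for illustration; it is not claimed to be one of the algorithms evaluated in PACD, and the function name, parameters, and step counts are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(window_a, window_b, n_permutations=1000, seed=0):
    """Estimate how often a random relabelling of the pooled samples gives
    a mean difference at least as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(mean(window_a) - mean(window_b))
    pooled = list(window_a) + list(window_b)
    n_a = len(window_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical daily step counts for two time periods.
before = [4100, 3900, 4300, 4000, 4200, 3800, 4050, 4150]
after = [7100, 6900, 7300, 7000, 7200, 6800, 7050, 7150]
print(permutation_p_value(before, after))
```

A small p-value suggests the difference between the two windows is unlikely under random relabelling; it says nothing about the nature of the change, which the abstract treats as a separate analysis step.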
Health information technology adoption: Understanding research protocols and outcome measurements for IT interventions in health care.
J Biomed Inform. 2016 Oct;63:33-44
Authors: Colicchio TK, Facelli JC, Del Fiol G, Scammon DL, Bowes WA, Narus SP
OBJECTIVE: To classify and characterize the variables commonly used to measure the impact of Information Technology (IT) adoption in health care, as well as settings and IT interventions tested, and to guide future research.
MATERIALS AND METHODS: We conducted a descriptive study screening a sample of 236 studies from a previous systematic review to identify outcome measures used and the availability of data to calculate these measures. We also developed a taxonomy of commonly used measures and explored setting characteristics and IT interventions.
RESULTS: Clinical decision support is the most common intervention tested, primarily in non-hospital-based clinics and large academic hospitals. We identified 15 taxa representing the 79 most commonly used measures. Quality of care was the most common category of these measurements with 62 instances, followed by productivity (11 instances) and patient safety (6 instances). Measures used varied according to type of setting, IT intervention and targeted population.
DISCUSSION: This study provides an inventory and a taxonomy of commonly used measures that will help researchers select measures in future studies as well as identify gaps in their measurement approaches. The classification of the other protocol components such as settings and interventions will also help researchers identify underexplored areas of research on the impact of IT interventions in health care.
CONCLUSION: A more robust and standardized measurement system and more detailed descriptions of interventions and settings are necessary to enable comparison between studies and a better understanding of the impact of IT adoption in health care settings.
PMID: 27450990 [PubMed - in process]
A computational framework for converting textual clinical diagnostic criteria into the quality data model.
J Biomed Inform. 2016 Oct;63:11-21
Authors: Hong N, Li D, Yu Y, Xiu Q, Liu H, Jiang G
BACKGROUND: Constructing standard and computable clinical diagnostic criteria is an important but challenging research field in the clinical informatics community. The Quality Data Model (QDM) is emerging as a promising information model for standardizing clinical diagnostic criteria.
OBJECTIVE: To develop and evaluate automated methods for converting textual clinical diagnostic criteria into a structured format using QDM.
METHODS: We used a clinical Natural Language Processing (NLP) tool known as cTAKES to detect sentences and annotate events in diagnostic criteria. We developed a rule-based approach for assigning the QDM datatype(s) to an individual criterion, whereas we invoked a machine learning algorithm based on the Conditional Random Fields (CRFs) for annotating attributes belonging to each particular QDM datatype. We manually developed an annotated corpus as the gold standard and used standard measures (precision, recall and f-measure) for the performance evaluation.
RESULTS: We harvested 267 individual criteria with the datatypes of Symptom and Laboratory Test from 63 textual diagnostic criteria. We manually annotated attributes and values in 142 individual Laboratory Test criteria. The average performance of our rule-based approach was a precision of 0.84, a recall of 0.86, and an f-measure of 0.85; the performance of CRFs-based classification was a precision of 0.95, a recall of 0.88, and an f-measure of 0.91. We also implemented a web-based tool that automatically translates textual Laboratory Test criteria into the QDM XML template format. The results indicated that our approaches leveraging cTAKES and CRFs are effective in facilitating diagnostic criteria annotation and classification.
CONCLUSION: Our NLP-based computational framework is a feasible and useful solution in developing diagnostic criteria representation and computerization.
PMID: 27444185 [PubMed - in process]
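The f-measures reported in the entry above follow from the stated precision and recall under the standard definition (the harmonic mean when beta is 1). The short sketch below reproduces that arithmetic; the function name is hypothetical, and the assumption that the authors used the balanced (beta = 1) form is inferred from the fact that the reported values match it.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; with beta=1 this is the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Precision and recall as reported in the abstract.
print(round(f_measure(0.84, 0.86), 2))  # rule-based datatype assignment
print(round(f_measure(0.95, 0.88), 2))  # CRFs-based attribute annotation
```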
User-centered design of multi-gene sequencing panel reports for clinicians.
J Biomed Inform. 2016 Oct;63:1-10
Authors: Cutting E, Banchero M, Beitelshees AL, Cimino JJ, Fiol GD, Gurses AP, Hoffman MA, Jeng LJ, Kawamoto K, Kelemen M, Pincus HA, Shuldiner AR, Williams MS, Pollin TI, Overby CL
The objective of this study was to develop a high-fidelity prototype for delivering multi-gene sequencing panel (GS) reports to clinicians that simulates the user experience of a final application. The delivery and use of GS reports can occur within complex and high-paced healthcare environments. We employed a user-centered software design approach in a focus group setting in order to facilitate gathering rich contextual information from a diverse group of stakeholders potentially impacted by the delivery of GS reports relevant to two precision medicine programs at the University of Maryland Medical Center. Responses from focus group sessions were transcribed, coded and analyzed by two team members. Notification mechanisms and information resources preferred by participants from our first phase of focus groups were incorporated into scenarios and the design of a software prototype for delivering GS reports. The goal of our second phase of focus groups, to gain input on the prototype software design, was accomplished by conducting task walkthroughs with GS reporting scenarios. Preferences for notification, content and consultation from genetics specialists appeared to depend upon familiarity with scenarios for ordering and delivering GS reports. Despite familiarity with some aspects of the scenarios we proposed, many of our participants agreed that they would likely seek consultation from a genetics specialist after viewing the test reports. In addition, participants offered design and content recommendations. Findings illustrated a need to support customized notification approaches, user-specific information, and access to genetics specialists with GS reports. These design principles can be incorporated into software applications that deliver GS reports. Our user-centered approach to conducting this assessment and the specific input we received from clinicians may also be relevant to others working on similar projects.
PMID: 27423699 [PubMed - in process]
A cloud-based mobile system to improve respiratory therapy services at home.
J Biomed Inform. 2016 Oct;63:45-53
Authors: Risso NA, Neyem A, Benedetto JI, Carrillo MJ, Farías A, Gajardo MJ, Loyola O
Chronic respiratory diseases are among the most prevalent health problems in the world. Treatment for these kinds of afflictions often takes place at home, where the continuous care of a medical specialist is frequently beyond the economic means of the patient, who therefore has to rely on informal caregivers (family, friends, etc.). Unfortunately, these treatments require a deep involvement on their part, which results in a heavy burden on the caregivers' routine and usually ends up deteriorating their quality of life. In recent years, mHealth and eHealth applications have gained wide interest in academia due to new capabilities enabled by the latest advancements in mobile technologies and wireless communication infrastructure. These innovations have resulted in several applications that have successfully managed to improve automatic patient monitoring and treatment and to bridge the distance between patients, caregivers and medical specialists. We therefore seek to move this trend forward by now pushing these capabilities into the field of respiratory therapies in order to assist patients with chronic respiratory diseases with their treatment, and to improve both their own and their caregivers' quality of life. This paper presents a cloud-based mobile system to support and improve homecare for respiratory diseases. The platform described uses vital signs monitoring as a way of sharing data between hospitals, caregivers and patients. Using an iterative research approach and the user's direct feedback, we show how mobile technologies can improve respiratory therapy and a family's quality of life.
PMID: 27392646 [PubMed - in process]
Ambient Intelligence for Health Environments.
J Biomed Inform. 2016 Oct 18;:
Authors: Bravo J, Cook D, Riva G
PMID: 27769889 [PubMed - as supplied by publisher]