NCBI: db=pubmed; Term=("journal of biomedical informatics"[Jour])

Measuring Content Overlap during Handoff Communication using Distributional Semantics: An Exploratory Study.

J Biomed Inform. 2016 Nov 29;:

Authors: Abraham J, Kannampallil T, Srinivasan V, Galanter W, Tagney G, Cohen T

Abstract
OBJECTIVE: We develop and evaluate a methodological approach to measure the degree and nature of overlap in handoff communication content within and across clinical professions. This extensible, exploratory approach relies on combining techniques from conversational analysis and distributional semantics.
MATERIALS AND METHODS: We audio-recorded handoff communication of residents and nurses on the General Medicine floor of a large academic hospital (n=120 resident and n=120 nurse handoffs). We measured semantic similarity, a proxy for content overlap, between resident-resident and nurse-nurse communication in multiple steps: a qualitative conversational content analysis; an automated semantic similarity analysis using Reflective Random Indexing (RRI); and a comparison of the RRI-derived semantic similarity with human ratings of semantic similarity.
RESULTS: There was a significant association between the semantic similarity computed by the RRI method and human ratings (ρ=0.88). Based on the semantic similarity scores, content overlap was relatively higher for content related to patient active problems, assessment of active problems, patient-identifying information, past medical history, and medications/treatments. In contrast, content overlap was limited for content related to allergies, family-related information, code status, and anticipatory guidance.
CONCLUSIONS: Our approach using RRI analysis provides new opportunities for characterizing the nature and degree of overlap in handoff communication. Although exploratory, this method provides a basis for identifying content that can be used to determine shared understanding across clinical professions. Additionally, this approach can inform the development of flexibly standardized handoff tools that reflect the clinical content most appropriate for fostering shared understanding during transitions of care.
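
The paper's RRI implementation is not reproduced here, but the underlying idea, random indexing with one reflective training cycle and cosine similarity as the overlap score, can be sketched as follows (the dimensions, seed counts and toy handoff texts are illustrative assumptions, not the authors' data):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_index_vector(dim=300, seeds=10):
    """Sparse ternary index vector: a few random +1/-1 entries."""
    v = np.zeros(dim)
    idx = rng.choice(dim, size=seeds, replace=False)
    v[idx] = rng.choice([-1.0, 1.0], size=seeds)
    return v

def rri_vectors(docs, dim=300):
    """One reflective cycle: doc index vectors -> term vectors -> doc vectors."""
    doc_index = [random_index_vector(dim) for _ in docs]
    vocab = {t for d in docs for t in d}
    # Term vector = sum of the index vectors of the documents containing the term.
    term_vec = {t: sum(doc_index[i] for i, d in enumerate(docs) if t in d)
                for t in vocab}
    # Reflective step: doc vector = sum of its terms' vectors.
    return [sum(term_vec[t] for t in d) for d in docs]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

handoffs = [
    "patient afebrile active problem pneumonia on antibiotics".split(),
    "pneumonia improving continue antibiotics monitor fever".split(),
    "code status full family updated no allergies".split(),
]
vecs = rri_vectors(handoffs)
# Handoffs sharing content should score higher than unrelated ones.
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

In practice RRI runs over large corpora and multiple reflective cycles; this sketch only shows why overlapping vocabulary drives the similarity score up.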

PMID: 27913246 [PubMed - as supplied by publisher]




Contextual Computing: A Bluetooth Based Approach for Tracking Healthcare Providers in the Emergency Room.

J Biomed Inform. 2016 Nov 29;:

Authors: Frisby J, Smith V, Traub S, Patel VL

Abstract
Hospital Emergency Departments (EDs) frequently experience crowding. One of the factors that contributes to this crowding is the "door to doctor time": the time from a patient's registration to when the patient is first seen by a physician. This is also one of the Meaningful Use (MU) performance measures that emergency departments report to the Centers for Medicare and Medicaid Services (CMS). Current documentation methods for this measure are inaccurate because of the imprecision of manual data collection. We describe a method for automatically, and more accurately, documenting the door to doctor time in real time. Single-board computers installed in patient rooms log each time a Bluetooth signal is detected from a device carried by a physician, and the distance between the physician and the computer is calculated from these logs. This distance is automatically compared with the accepted room radius to determine whether the physician was present in the room at the logged time. The logged times, accurate to the second, were compared with physicians' handwritten times, showing the automatic recordings to be more precise. This real-time automatic method frees the physician from the extra cognitive load of manually recording data. The method is generic and can be used in settings other than the ED, and for purposes other than measuring physician time.
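
The abstract does not give the distance formula; a common choice for Bluetooth receivers is the log-distance path-loss model, sketched below. The calibration constants are assumptions for illustration (`tx_power_dbm` is the RSSI expected at one meter, `n` the path-loss exponent), not values from the study:

```python
def estimate_distance(rssi_dbm, tx_power_dbm=-59.0, n=2.0):
    """Estimate distance in meters from a received signal strength reading,
    using the log-distance path-loss model."""
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10.0 * n))

def in_room(rssi_dbm, room_radius_m=3.0):
    """Flag the physician as present when the estimate falls inside the room."""
    return estimate_distance(rssi_dbm) <= room_radius_m

print(in_room(-62))   # strong signal, roughly 1.4 m away
print(in_room(-85))   # weak signal, well outside a 3 m room radius
```

Real deployments calibrate these constants per room, since walls and bodies attenuate BLE signals unpredictably.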

PMID: 27913245 [PubMed - as supplied by publisher]




Pilot Evaluation of a Method to Assess Prescribers' Information Processing of Medication Alerts.

J Biomed Inform. 2016 Nov 28;:

Authors: Russ AL, Melton BL, Daggy JK, Saleem JJ

Abstract
BACKGROUND: Prescribers commonly receive alerts during medication ordering. Prescribers work in a complex, time-pressured environment; to enhance the effectiveness of safety alerts, the effort needed to cognitively process these alerts should be minimized. Methods to evaluate the extent to which computerized alerts support prescribers' information processing are lacking.
OBJECTIVE: To develop a methodological protocol to assess the extent to which alerts support prescribers' information processing at a glance; specifically, the incorporation of information into working memory. We hypothesized that the method would be feasible and that we would be able to detect a significant difference in prescribers' information processing with a revised alert display that incorporates warning design guidelines compared to the original alert display.
METHODS: A counterbalanced, within-subject study was conducted with 20 prescribers in a human-computer interaction laboratory. We tested a single alert that was displayed in two different ways. Prescribers were informed that an alert would appear for 10 seconds. After the alert was shown, a white screen was displayed, and prescribers were asked to verbally describe what they saw, indicate how many warnings were shown in total, and describe anything else they remembered about the alert. We measured information processing via the accuracy of prescribers' free recall and their ability to identify that three warning messages were present. Two analysts independently evaluated participants' responses against a comprehensive catalog of alert elements and then discussed discrepancies until reaching consensus.
RESULTS: This feasibility study demonstrated that the method seemed to be effective for evaluating prescribers' information processing of medication alert displays. With this method, we were able to detect significant differences in prescribers' recall of alert information. The proportion of total data elements that prescribers were able to accurately recall was significantly greater for the revised versus original alert display (p = .006). With the revised display, more prescribers accurately reported that three warnings were shown (p = .002).
CONCLUSIONS: The methodological protocol was feasible for evaluating the alert display and yielded important findings on prescribers' information processing. Study methods supplement traditional usability evaluation methods and may be useful for evaluating information processing of other healthcare technologies.

PMID: 27908833 [PubMed - as supplied by publisher]




The virtual dissecting room: creating highly detailed anatomy models for educational purposes.

J Biomed Inform. 2016 Nov 21;:

Authors: Zilverschoon M, Vincken KL, Bleys RL

Abstract
INTRODUCTION: Virtual 3D models are powerful tools for teaching anatomy. Many different digital anatomy models are currently available; most of these commercial applications are based on a 3D model of a human body reconstructed from images taken at 1-millimeter intervals. Using even smaller intervals may yield more detail and a more realistic appearance in 3D anatomy models. The aim of this study was to create a realistic and highly detailed 3D model of the hand and wrist based on small-interval cross-sectional images, suitable for undergraduate and postgraduate teaching purposes, with the possibility of performing a virtual dissection in an educational application.
METHODS: In 115 transverse cross-sections from a human hand and wrist, segmentation was performed by manually delineating 90 different structures. The segments were imported into Amira to create a surface (polygon) model, and the surfaces were then smoothed in Mudbox. In 3D Coat, the smoothed polygon models were automatically retopologized into quadrilaterals and a UV map was added. In Mudbox, the textures of the 90 structures were depicted realistically using photos of real tissue; height, gloss and specular maps were then created to add further detail and realistic lighting to every structure. Unity was used to build a new software program supporting all the extra map features together with the preferred user interface.
CONCLUSION: A 3D hand model has been created, containing 100 structures (90 at start and 10 extra structures added along the way). The model can be used interactively by changing the transparency, manipulating single or grouped structures and thereby simulating a virtual dissection. This model can be used for a variety of teaching purposes, ranging from undergraduate medical students to residents of hand surgery. Studying the hand and wrist anatomy using this model is cost-effective and not hampered by the limited access to real dissecting facilities.

PMID: 27884788 [PubMed - as supplied by publisher]




Prediction of microRNAs involved in immune system diseases through network based features.

J Biomed Inform. 2016 Nov 15;:

Authors: Prabahar A, Natarajan J

Abstract
MicroRNAs are a class of small non-coding regulatory RNA molecules that modulate the expression of several genes at the post-transcriptional level and play a vital role in disease pathogenesis. Recent research shows that a range of miRNAs are involved in the regulation of immunity and that their deregulation results in immune-mediated diseases such as cancer, inflammation and autoimmune diseases. Computational discovery of these immune miRNAs using a set of specific features is highly desirable. In the current investigation, we present an SVM-based classification system which uses a set of novel network-based topological and motif features, in addition to the baseline sequential and structural features, to distinguish immune-specific miRNAs from non-immune miRNAs. The classifier was trained and tested on a balanced set with equal numbers of positive and negative examples to show the discriminative power of our network features. Experimental results show that our approach achieves an accuracy of 90.2% and outperforms the classification accuracy of 63.2% reported using the traditional miRNA sequential and structural features. The proposed classifier was further validated, with higher accuracy, on two immune disease sub-class datasets related to multiple sclerosis microarray data and psoriasis RNA-seq data. The results indicate that our classifier, which uses network and motif features along with sequential and structural features, leads to a significant improvement in classifying immune miRNAs and can hence be applied to identify other specific classes of miRNAs as an extensible miRNA classification system.
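
The exact network features are defined in the paper itself; as a generic illustration of one such topological feature, normalized degree centrality over a toy miRNA-gene interaction network (the edges and identifiers below are hypothetical) can be computed as:

```python
def degree_centrality(edges):
    """Normalized degree centrality for every node in an undirected network:
    number of neighbors divided by the number of other nodes."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}

# Toy miRNA-gene interaction edges (hypothetical targets).
edges = [("miR-155", "SOCS1"), ("miR-155", "SHIP1"),
         ("miR-21", "PTEN"), ("miR-155", "PTEN")]
dc = degree_centrality(edges)
print(dc["miR-155"])  # 3 neighbors out of 4 other nodes -> 0.75
```

In a classifier such as the one described, values like these would become one column of the feature matrix alongside the sequential and structural features.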

PMID: 27871823 [PubMed - as supplied by publisher]




Generating disease-pertinent treatment vocabularies from MEDLINE citations.

J Biomed Inform. 2016 Nov 16;:

Authors: Wang L, Del Fiol G, Bray BE, Haug PJ

Abstract
OBJECTIVE: Healthcare communities have identified a significant need for disease-specific information. Disease-specific ontologies are useful in assisting the retrieval of disease-relevant information from various sources. However, building these ontologies is labor intensive. Our goal is to develop a system for the automated generation of disease-pertinent concepts from a popular knowledge resource to support the building of disease-specific ontologies.
METHODS: A pipeline system was developed with an initial focus on generating disease-specific treatment vocabularies. It comprised components for disease-specific citation retrieval, predication extraction, treatment predication extraction, treatment concept extraction, and relevance ranking. A semantic schema was developed to support the extraction of treatment predications and concepts. Four ranking approaches (occurrence, interest, degree centrality, and weighted degree centrality) were proposed to measure the relevance of treatment concepts to the disease of interest. We measured the performance of the four ranking approaches in terms of mean precision at the top 100 concepts for five diseases, as well as precision-recall curves against two reference vocabularies. The performance of the system was also compared to two baseline approaches.
RESULTS: The pipeline system achieved a mean precision of 0.80 for the top 100 concepts with ranking by interest. There was no significant difference among the four ranking approaches (p=0.53). However, the pipeline-based system performed significantly better than the two baselines.
CONCLUSIONS: The pipeline system can be useful for the automated generation of disease-relevant treatment concepts from the biomedical literature.
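
The evaluation metric, mean precision at the top 100 concepts, reduces to a standard precision-at-k computation. A minimal sketch with hypothetical ranked concepts and a toy reference vocabulary:

```python
def precision_at_k(ranked_concepts, reference_vocabulary, k=100):
    """Fraction of the top-k ranked concepts found in a reference vocabulary."""
    top = ranked_concepts[:k]
    return sum(c in reference_vocabulary for c in top) / len(top)

# Hypothetical ranked treatment concepts for one disease.
ranked = ["metformin", "insulin", "exercise", "aspirin"]
reference = {"metformin", "insulin", "lifestyle modification"}
print(precision_at_k(ranked, reference, k=4))  # 2 of the top 4 are in the reference
```

Averaging this value over the five diseases would give the mean precision the study reports.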

PMID: 27866001 [PubMed - as supplied by publisher]




Templates as a method for implementing data provenance in decision support systems.

J Biomed Inform. 2016 Nov 14;:

Authors: Curcin V, Fairweather E, Danger R, Corrigan D

Abstract
Decision support systems are used as a method of promoting consistent guideline-based diagnosis, supporting clinical reasoning at the point of care. However, despite the availability of numerous commercial products, the wider acceptance of these systems has been hampered by concerns about diagnostic performance and a perceived lack of transparency in the process of generating clinical recommendations. This resonates with the Learning Health System paradigm, which promotes data-driven medicine relying on routine data capture and transformation and also stresses the need for trust in an evidence-based system. Data provenance is a way of automatically capturing the trace of a research task and its resulting data, thereby facilitating trust and the principles of reproducible research. While computational domains have started to embrace this technology through provenance-enabled execution middlewares, traditionally non-computational disciplines, such as medical research, that do not rely on a single software platform are still struggling with its adoption. To address these issues, we introduce provenance templates: abstract provenance fragments representing meaningful domain actions. Templates can be used to generate a model-driven service interface for domain software tools to routinely capture the provenance of their data and tasks. This paper specifies the requirements for a decision support tool based on the Learning Health System, introduces the theoretical model for provenance templates, and demonstrates the resulting architecture. Our methods were tested and validated on the provenance infrastructure for a Diagnostic Decision Support System developed as part of the EU FP7 TRANSFoRm project.

PMID: 27856379 [PubMed - as supplied by publisher]




Methodological Framework for Evaluating Clinical Processes: A Cognitive Informatics Perspective.

J Biomed Inform. 2016 Nov 12;:

Authors: Kannampallil T, Abraham J, Patel VL

Abstract
We propose a methodological framework for evaluating clinical cognitive activities in complex real-world environments that provides a guide for characterizing the patterns of activities. This approach, which we refer to as a process-based approach, is particularly relevant to cognitive informatics (CI) research, an interdisciplinary domain utilizing cognitive approaches in the study of computing systems and applications, as it provides new ways of understanding human information processing, interactions, and behaviors. Using this approach involves identifying a process of interest (e.g., a clinical workflow) and the contributing sequences of activities in that process (e.g., medication ordering). A variety of analytical approaches can then be used to characterize the inherent dependencies and relations among the contributing activities within the considered process. Using examples drawn from our own research and the extant literature, we describe the theoretical foundations of the process-based approach, relevant practical and pragmatic considerations for using it, and a generic framework for applying it in evaluation studies in clinical settings. We also discuss the potential of this approach for future evaluations of interactive clinical systems, given the need for new evaluation approaches and the significant opportunities for automated, unobtrusive data collection.

PMID: 27847328 [PubMed - as supplied by publisher]




Anonymizing datasets with demographics and diagnosis codes in the presence of utility constraints.

J Biomed Inform. 2016 Nov 07;:

Authors: Poulis G, Loukides G, Skiadopoulos S, Gkoulalas-Divanis A

Abstract
Publishing data about patients that contain both demographics and diagnosis codes is essential to perform large-scale, low-cost medical studies. However, preserving the privacy and utility of such data is challenging, because it requires: (i) guarding against identity disclosure (re-identification) attacks based on both demographics and diagnosis codes, (ii) ensuring that the anonymized data remain useful in intended analysis tasks, and (iii) minimizing the information loss incurred by anonymization, to preserve the utility of general analysis tasks that are difficult to determine before data publishing. Existing anonymization approaches are not suitable for this setting, because they cannot satisfy all three requirements. Therefore, in this work, we propose a new approach to deal with this problem. We enforce requirement (i) by applying (k,k^m)-anonymity, a privacy principle that prevents re-identification by attackers who know the demographics of a patient and up to m of their diagnosis codes, where k and m are tunable parameters. To capture requirement (ii), we propose the concept of utility constraints for both demographics and diagnosis codes. Utility constraints limit the amount of generalization and are specified by data owners (e.g., the healthcare institution that performs anonymization). We also capture requirement (iii) by employing well-established information loss measures for demographics and for diagnosis codes. To realize our approach, we develop an algorithm that enforces (k,k^m)-anonymity on a dataset containing both demographics and diagnosis codes, in a way that satisfies the specified utility constraints and with minimal information loss, according to the measures. Our experiments with a large dataset containing more than 200,000 electronic health records show the effectiveness and efficiency of our algorithm.
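
As an illustration of the privacy principle only (the authors' contribution is an anonymization algorithm, not a checker), a brute-force verifier of (k,k^m)-anonymity over toy records might look like:

```python
from itertools import combinations

def satisfies_k_km_anonymity(records, k, m):
    """Brute-force check of (k,k^m)-anonymity: every combination of a record's
    demographics with up to m of its diagnosis codes (including zero codes,
    i.e. demographics alone) must match at least k records in the dataset."""
    def matches(demo, codes):
        return sum(1 for d, c in records if d == demo and set(codes) <= set(c))
    for demo, codes in records:
        for size in range(0, m + 1):
            for combo in combinations(sorted(set(codes)), size):
                if matches(demo, combo) < k:
                    return False
    return True

# Toy records: (generalized demographics, diagnosis codes).
records = [
    (("M", "30-39"), {"250.00", "401.9"}),
    (("M", "30-39"), {"250.00", "401.9"}),
    (("F", "40-49"), {"272.4"}),
]
print(satisfies_k_km_anonymity(records, k=2, m=2))  # False: the F record is unique
```

The actual algorithm generalizes demographics and diagnosis codes until a check like this passes, while respecting the utility constraints and minimizing information loss.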

PMID: 27832965 [PubMed - as supplied by publisher]




Prognostics of Surgical Site Infections using Dynamic Health Data.

J Biomed Inform. 2016 Nov 04;:

Authors: Ke C, Jin Y, Evans H, Lober B, Qian X, Liu J, Huang S

Abstract
Surgical Site Infection (SSI) is a national priority in healthcare research, and much attention has been devoted to developing better SSI risk prediction models. However, most existing SSI risk prediction models are built on static risk factors such as comorbidities and operative factors. In this paper, we investigate the use of dynamic wound data for SSI risk prediction. Emerging mobile health (mHealth) tools can closely monitor patients and generate continuous measurements of many wound-related variables and other evolving clinical variables. Since existing SSI prediction models have quite limited capacity to utilize such evolving clinical data, we develop a solution that equips these mHealth tools with decision-making capabilities for SSI prediction, using a seamless assembly of several machine learning models to tackle the analytic challenges arising from spatial-temporal data. The basic idea is to exploit the low-rank property of the spatial-temporal data via a bilinear formulation, further enhanced with automatic missing-data imputation using matrix completion. We derive efficient optimization algorithms to implement these models and demonstrate the superior performance of our new predictive model on a real-world SSI dataset, compared to a range of state-of-the-art methods.
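
The paper's bilinear model is not reproduced here; a generic low-rank matrix completion iteration ("hard impute": fill the missing entries, project to rank r via SVD, repeat) illustrates the imputation idea on a toy rank-1 "patient x day" measurement matrix:

```python
import numpy as np

def complete_matrix(X, mask, rank=1, iters=100):
    """Iterative low-rank imputation: fill missing entries with the observed
    mean, project onto a rank-r SVD approximation, and repeat.
    X: data matrix; mask: True where the entry was observed."""
    filled = np.where(mask, X, X[mask].mean())
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        filled = np.where(mask, X, low_rank)  # keep observed entries fixed
    return filled

# Toy rank-1 measurement matrix with one unobserved entry (true value 8.0).
truth = np.outer([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
mask = np.ones_like(truth, dtype=bool)
mask[1, 2] = False
est = complete_matrix(truth, mask)
print(round(est[1, 2], 2))
```

Because the underlying matrix is exactly rank 1, the iteration recovers the missing measurement almost exactly; real wound data would be only approximately low-rank.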

PMID: 27825798 [PubMed - as supplied by publisher]




Identifying complexity in infectious diseases inpatient settings: An observation study.

J Biomed Inform. 2016 Nov 03;:

Authors: Roosan D, Weir C, Samore M, Jones M, Rahman M, Stoddard GJ, Del Fiol G

Abstract
BACKGROUND: Understanding complexity in healthcare has the potential to reduce decision and treatment uncertainty. Therefore, identifying both patient and task complexity may offer better task allocation and design recommendation for next-generation health information technology system design.
OBJECTIVE: To identify specific complexity-contributing factors in the infectious disease domain and the relationship with the complexity perceived by clinicians.
METHODS: We observed and audio-recorded clinical rounds of three infectious disease teams. Thirty cases were observed for a period of four consecutive days. Transcripts were coded for clinical complexity-contributing factors from the clinical complexity model. Ratings of complexity on day 1 were collected for each case. We then used statistical methods to relate the complexity-contributing factors to the complexity perceived by clinicians.
RESULTS: A factor analysis (principal component extraction with varimax rotation) of specific items revealed three factors (eigenvalues>2.0) explaining 47% of total variance: task interaction and goals (10 items, 26%, Cronbach's alpha=0.87), urgency and acuity (6 items, 11%, Cronbach's alpha=0.67), and psychosocial behavior (4 items, 10%, Cronbach's alpha=0.55). A linear regression analysis showed no statistically significant association between the complexity perceived by the physicians and objective complexity, which was measured from transcripts coded by three clinicians (multiple R-squared=0.13, p=0.61). There were no physician effects on the rating of perceived complexity.
CONCLUSION: Task complexity contributes significantly to overall complexity in the infectious diseases domain. The different complexity-contributing factors found in this study can guide health information technology system designers and researchers for intuitive design. Thus, decision support tools can help reduce the specific complexity-contributing factors. Future studies aimed at understanding clinical domain-specific complexity-contributing factors can ultimately improve task allocation and design for intuitive clinical reasoning.
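
The internal-consistency statistics reported in the results (e.g., Cronbach's alpha=0.87 for the first factor) follow the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the totals). A sketch with hypothetical scores (not the study's data):

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha for an (items x cases) score matrix."""
    item_scores = np.asarray(item_scores, dtype=float)
    k = item_scores.shape[0]                       # number of items
    item_vars = item_scores.var(axis=1, ddof=1).sum()
    total_var = item_scores.sum(axis=0).var(ddof=1)  # variance of case totals
    return k / (k - 1) * (1 - item_vars / total_var)

# Toy data: 3 coded items scored across 5 cases (hypothetical values).
scores = [[3, 4, 2, 5, 4],
          [2, 4, 2, 4, 3],
          [3, 5, 1, 4, 4]]
print(round(cronbach_alpha(scores), 2))
```

Values near 0.87 indicate the items move together across cases; the 0.55 reported for the psychosocial factor is conventionally considered weak consistency.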

PMID: 27818310 [PubMed - as supplied by publisher]




Trends on the Application of Serious Games to Neuropsychological Evaluation: A Scoping Review.

J Biomed Inform. 2016 Nov 01;:

Authors: Valladares-Rodríguez S, Pérez-Rodríguez R, Anido-Rifón L, Fernández-Iglesias M

Abstract
BACKGROUND: The dramatic technological advances witnessed in recent years present a great opportunity to change the way neuropsychological evaluations are performed in clinical practice. Serious games in particular have been posed as the cornerstone of this still incipient paradigm shift, as they have characteristics that make them especially advantageous for overcoming limitations associated with traditional pen-and-paper neuropsychological tests: they can be easily administered, and they can feature complex environments for the evaluation of neuropsychological constructs that are difficult to assess through traditional tests. The objective of this study was to conduct a scoping literature review to rapidly map the key concepts underpinning research on the use of serious games for neuropsychological evaluation over the last 25 years.
METHODS: MEDLINE, PsycINFO, Scopus and IEEE Xplore databases were systematically searched. The main eligibility criteria were studies published in a peer-reviewed journal; written in English; published in the last 25 years; focused on the human population; and classified in the neuropsychological field. To reduce the risk of bias, studies were selected by consensus of experts, focusing primarily on psychometric properties. Selected studies were then analyzed according to a set of dimensions commonly used for evaluating neuropsychological tests.
RESULTS: After applying the search strategy, 57 studies (including 54 serious games) met our selection criteria. The selected studies deal with visuospatial capabilities, memory, attention, executive functions, and complex neuropsychological constructs such as Mild Cognitive Impairment (MCI). Results show that the implementation of serious games for neuropsychological evaluation is tackled in several different ways in the selected studies, and that studies have so far been mainly exploratory, aiming only to test the feasibility of the proposed approaches.
DISCUSSION: It may be argued that the limited number of databases used might compromise this study. However, we believe the included sample is representative, in spite of how difficult it is to achieve optimal and maximal scope. Indeed, this review identifies other research issues related to the development of serious games beyond their reliability and validity. The main conclusion of this review is that there is great interest in the research community in the use of serious games for neuropsychological evaluation. The increasing number of studies published in the last three years demonstrates their potential as a serious alternative to classic neuropsychological tests. Nevertheless, more research is needed to implement serious games that are reliable, valid, and ready for everyday clinical practice.

PMID: 27815228 [PubMed - as supplied by publisher]



Can Multilinguality Improve Biomedical Word Sense Disambiguation?

J Biomed Inform. 2016 Nov 01;:

Authors: Duque A, Martinez-Romo J, Araujo L

Abstract
Ambiguity in the biomedical domain represents a major issue when performing Natural Language Processing tasks over the huge amount of available information in the field. For this reason, Word Sense Disambiguation is critical for achieving accurate systems able to tackle complex tasks such as information extraction, summarization or document classification. In this work we explore whether multilinguality can help to solve the problem of ambiguity, and the conditions required for a system to improve the results obtained by monolingual approaches. Also, we analyse the best ways to generate those useful multilingual resources, and study different languages and sources of knowledge. The proposed system, based on co-occurrence graphs containing biomedical concepts and textual information, is evaluated on a test dataset frequently used in biomedicine. We can conclude that multilingual resources are able to provide a clear improvement of more than 7% compared to monolingual approaches, for graphs built from a small number of documents. Also, empirical results show that automatically translated resources are a useful source of information for this particular task.
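
The authors' co-occurrence graphs are far richer, but the core disambiguation step, choosing the sense whose neighborhood best overlaps the ambiguous word's context, can be sketched with hypothetical sense neighborhoods:

```python
def disambiguate(context_words, sense_neighborhoods):
    """Pick the candidate sense whose co-occurrence neighborhood overlaps
    the ambiguous word's context the most (ties broken arbitrarily)."""
    context = set(context_words)
    return max(sense_neighborhoods,
               key=lambda sense: len(context & sense_neighborhoods[sense]))

# Hypothetical neighborhoods for the ambiguous term "cold"; the paper would
# derive these from (possibly multilingual, automatically translated) corpora.
senses = {
    "common_cold": {"virus", "symptom", "cough", "fever"},
    "cold_temperature": {"weather", "winter", "degrees", "freezing"},
}
print(disambiguate(["patient", "cough", "fever", "virus"], senses))
```

The multilingual contribution of the paper amounts to enriching those neighborhoods with evidence from additional languages, which the authors report improves accuracy by more than 7% for graphs built from few documents.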

PMID: 27815227 [PubMed - as supplied by publisher]




Evaluating Semantic Similarity between Chinese Biomedical Terms through Multiple Ontologies with Score Normalization: An Initial Study.

J Biomed Inform. 2016 Oct 31;:

Authors: Ning W, Yu M, Kong D

Abstract
BACKGROUND: Semantic similarity estimation significantly promotes the understanding of natural language resources and supports medical decision making. Previous studies have investigated semantic similarity and relatedness estimation between biomedical terms through resources in English, such as SNOMED CT or the UMLS. However, very few studies have focused on the Chinese language, and technology for natural language processing and text mining of medical documents in China is urgently needed. Due to the lack of a complete and publicly available biomedical ontology in China, we only have access to several modest-sized ontologies with no overlaps. Although these ontologies do not completely cover biomedicine, each covers its respective domain acceptably. In this paper, semantic similarity estimation between Chinese biomedical terms using these multiple non-overlapping ontologies is explored as an initial study.
METHODS: Typical path-based and information content (IC)-based similarity measures were applied on these ontologies. From the analysis of the computed similarity scores, heterogeneity in the statistical distributions of scores derived from multiple ontologies was discovered. This heterogeneity hampers the comparability of scores and the overall accuracy of similarity estimation. This problem was addressed through a novel language-independent method by combining semantic similarity estimation and score normalization. A reference standard was also created in this study.
RESULTS: Compared with the existing task-independent normalization methods, the newly developed method exhibited superior performance on most IC-based similarity measures. The accuracy of semantic similarity estimation was enhanced through score normalization. This enhancement resulted from the mitigation of heterogeneity in the similarity scores derived from multiple ontologies.
CONCLUSION: We demonstrated the potential necessity of score normalization when estimating semantic similarity using ontology-based measures. The results of this study can also be extended to other language systems to implement semantic similarity estimation in biomedicine.
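
As a concrete illustration (the IC values and score lists below are hypothetical), an IC-based measure such as Lin's similarity, combined with per-ontology z-score normalization to make scores from different ontologies comparable, might look like:

```python
from statistics import mean, stdev

def lin_similarity(ic_a, ic_b, ic_lcs):
    """Lin's IC-based measure: 2*IC(LCS) / (IC(a) + IC(b)), where LCS is the
    least common subsumer of the two concepts in the ontology."""
    return 2.0 * ic_lcs / (ic_a + ic_b)

def z_normalize(scores):
    """Z-score normalization of one ontology's similarity scores, so scores
    derived from different ontologies land on a comparable scale."""
    mu, sigma = mean(scores), stdev(scores)
    return [(s - mu) / sigma for s in scores]

# Hypothetical information-content values from a single ontology.
print(round(lin_similarity(ic_a=5.2, ic_b=4.8, ic_lcs=3.0), 2))
print(z_normalize([0.2, 0.5, 0.8]))
```

The paper's contribution is a task-independent way of doing such normalization across heterogeneous ontologies; plain z-scoring is only the simplest baseline of that idea.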

PMID: 27810481 [PubMed - as supplied by publisher]




Evaluation of Relational and NoSQL Database Architectures to Manage Genomic Annotations.

J Biomed Inform. 2016 Oct 31;:

Authors: Schulz WL, Nelson BG, Felker DK, Durant TJ, Torres R

Abstract
While the adoption of next generation sequencing has rapidly expanded, the informatics infrastructure used to manage the data generated by this technology has not kept pace. Historically, relational databases have provided much of the framework for data storage and retrieval. Newer technologies based on NoSQL architectures may provide significant advantages in storage and query efficiency, thereby reducing the cost of data management. But their relative advantage when applied to biomedical data sets, such as genetic data, has not been characterized. To this end, we compared the storage, indexing, and query efficiency of a common relational database (MySQL), a document-oriented NoSQL database (MongoDB), and a relational database with NoSQL support (PostgreSQL). When used to store genomic annotations from the dbSNP database, we found the NoSQL architectures to outperform traditional, relational models for speed of data storage, indexing, and query retrieval in nearly every operation. These findings strongly support the use of novel database technologies to improve the efficiency of data management within the biological sciences.
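
The study benchmarks MySQL, MongoDB and PostgreSQL, all of which require running servers; the same load/index/query timing pattern can be sketched with SQLite as a serverless stand-in (the schema, row counts and query are illustrative, not the study's setup):

```python
import sqlite3
import time

def timed(label, fn):
    """Run fn, print its wall-clock time, and return its result."""
    t0 = time.perf_counter()
    result = fn()
    print(f"{label}: {time.perf_counter() - t0:.3f}s")
    return result

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE snp (rsid TEXT, chrom TEXT, pos INTEGER)")

# Load: bulk-insert toy variant annotations (stand-in for dbSNP records).
rows = [(f"rs{i}", "chr1", i * 100) for i in range(100_000)]
timed("load", lambda: con.executemany("INSERT INTO snp VALUES (?, ?, ?)", rows))

# Index: build the position index that range queries rely on.
timed("index", lambda: con.execute("CREATE INDEX idx_pos ON snp (chrom, pos)"))

# Query: a range scan typical of genomic region lookups.
hits = timed("query", lambda: con.execute(
    "SELECT COUNT(*) FROM snp WHERE chrom = 'chr1' AND pos BETWEEN 5000 AND 9000"
).fetchone()[0])
print(hits)
```

Repeating the identical workload against each backend, as the paper does at dbSNP scale, is what makes the per-operation timings comparable.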

PMID: 27810480 [PubMed - as supplied by publisher]




Sensitivity analysis of gene ranking methods in phenotype prediction.
Sensitivity analysis of gene ranking methods in phenotype prediction.

J Biomed Inform. 2016 Oct 25;:

Authors: deAndrés-Galiana EJ, Fernández-Martínez JL, Sonis ST

Abstract
INTRODUCTION: It has become clear that noise generated during the assay and analytical processes can disrupt the accurate interpretation of genomic studies. Such noise not only affects the scientific validity and cost of studies; when assessed in the context of clinically translatable indications such as phenotype prediction, it can also lead to inaccurate conclusions that ultimately affect patients. We applied a sequence of ranking methods to damp noise associated with microarray outputs, and then tested the utility of the approach in three disease indications using publicly available datasets.
MATERIALS AND METHODS: This study was performed in three phases. We first theoretically analyzed the effect of noise in phenotype prediction problems, showing that it can be expressed as a modeling error that partially falsifies the pathways. Second, via synthetic modeling, we performed a sensitivity analysis of the main gene ranking methods to different types of noise. Finally, we studied the predictive accuracy of the gene lists provided by these ranking methods in synthetic data and in three different datasets related to cancer, rare and neurodegenerative diseases to better understand the translational aspects of our findings.
RESULTS AND DISCUSSION: In the case of synthetic modeling, we showed that Fisher's Ratio (FR) was the most robust gene ranking method in terms of precision for all types of noise at different levels. Significance Analysis of Microarrays (SAM) provided slightly lower performance, and the remaining methods (fold change, entropy and maximum percentile distance) were much less precise and accurate. The predictive accuracy of the smallest set of highly discriminatory probes was similar for all the methods in the case of Gaussian and Log-Gaussian noise. In the case of class assignment noise, the predictive accuracy of SAM and FR was higher. Finally, for real datasets (Chronic Lymphocytic Leukemia, Inclusion Body Myositis and Amyotrophic Lateral Sclerosis) we found that FR and SAM provided the highest predictive accuracies with the smallest number of genes. Biological pathways were found with an expanded list of genes whose discriminatory power had been established via FR.
CONCLUSIONS: We have shown that noise in expression data and class assignment partially falsifies the sets of discriminatory probes in phenotype prediction problems. FR and SAM better exploit the principle of parsimony and are able to find smaller subsets of highly discriminatory genes. Predictive accuracy and precision are two different metrics for selecting the important genes, since in the presence of noise the most predictive genes do not completely coincide with those that are related to the phenotype. Based on the synthetic results, FR and SAM are recommended to unravel the biological pathways involved in disease development.

PMID: 27793724 [PubMed - as supplied by publisher]
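Fisher's Ratio, which the abstract identifies as the most noise-robust ranking method, is a standard per-gene score: the squared difference of the class means divided by the sum of the class variances. A minimal sketch with synthetic expression values (the formula is standard; the data below are invented for illustration, not from the study's datasets):

```python
# Minimal sketch of gene ranking by Fisher's Ratio (FR):
#   FR = (mu1 - mu2)^2 / (var1 + var2)
# computed per gene across two phenotype classes; higher FR means
# more discriminatory. Expression values below are synthetic.
from statistics import mean, pvariance

def fisher_ratio(class1, class2):
    """Discriminatory power of one gene from its expression in two classes."""
    m1, m2 = mean(class1), mean(class2)
    v1, v2 = pvariance(class1), pvariance(class2)
    return (m1 - m2) ** 2 / (v1 + v2)

# gene -> (expression in class 1 samples, expression in class 2 samples)
expression = {
    "gene_A": ([1.0, 1.1, 0.9], [5.0, 5.2, 4.8]),   # strongly discriminatory
    "gene_B": ([2.0, 3.0, 2.5], [2.4, 3.1, 2.3]),   # weakly discriminatory
    "gene_C": ([0.5, 0.6, 0.4], [1.5, 1.4, 1.6]),   # moderately discriminatory
}

# Rank genes from most to least discriminatory by FR.
ranking = sorted(expression, key=lambda g: fisher_ratio(*expression[g]),
                 reverse=True)
print(ranking)
```

Because FR penalizes within-class variance, a gene with noisy but overlapping class distributions (like gene_B) ranks below a gene with a smaller but cleanly separated mean shift (like gene_C).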



Identifying Impact of Software Dependencies on Replicability of Biomedical Workflows.

J Biomed Inform. 2016 Oct 24;:

Authors: Miksa T, Rauber A, Mina E

Abstract
Complex data-driven experiments form the basis of biomedical research. Recent findings warn that the context in which software is run, that is, the infrastructure and the third-party dependencies, can have a crucial impact on the final results delivered by a computational experiment. This implies that in order to replicate a result, not only must the same data be used, but the experiment must also be run on an equivalent software stack. In this paper we present the VFramework, which enables assessing the replicability of workflows. It identifies whether any differences in software dependencies exist between two executions of the same workflow and whether they have an impact on the produced results. We also conduct a case study in which we investigate the impact of software dependencies on the replicability of Taverna workflows used in biomedical research on Huntington's disease. We re-execute the analysed workflows in environments that differ in operating system distribution and configuration. The results show that the VFramework can be used to identify the impact of software dependencies on the replicability of biomedical workflows. Furthermore, we observe that even though the workflows are executed in a controlled environment, they still depend on specific tools installed in that environment. The context model used by the VFramework addresses the deficiencies of provenance traces and also documents such tools. Based on our findings, we define guidelines that enable workflow owners to improve the replicability of their workflows.
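The dependency comparison at the heart of this kind of check can be sketched minimally: given two snapshots of installed tools and their versions, report what is missing from either environment and where versions diverge. This is an illustrative re-implementation of the general idea, not the VFramework itself, and the tool names and versions below are hypothetical.

```python
# Sketch of a dependency diff between two workflow executions: report
# tools present in only one environment and tools whose versions differ,
# i.e. the differences that can threaten replicability.
# Tool names and versions are hypothetical examples.

def diff_dependencies(env_a, env_b):
    """Return tools unique to each environment and version mismatches."""
    only_a = sorted(set(env_a) - set(env_b))
    only_b = sorted(set(env_b) - set(env_a))
    changed = sorted(
        (tool, env_a[tool], env_b[tool])
        for tool in set(env_a) & set(env_b)
        if env_a[tool] != env_b[tool]
    )
    return only_a, only_b, changed

# Dependency snapshots of two executions (tool -> version).
run_1 = {"taverna": "2.5", "R": "3.2.1", "samtools": "1.2"}
run_2 = {"taverna": "2.5", "R": "3.3.0", "bcftools": "1.2"}

only_1, only_2, changed = diff_dependencies(run_1, run_2)
print(only_1, only_2, changed)
```

An empty result from such a diff is a necessary (though not sufficient) condition for two executions to be considered equivalent at the software-stack level; the paper's context model additionally captures tools that provenance traces alone would miss.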

PMID: 27789415 [PubMed - as supplied by publisher]