
Jaccard coefficient software download: How to calculate similarity between binary data

lauriemonteagudo30


This software is designed to calculate similarity coefficients for different segmentations of the same image, to analyse the performance of segmentation algorithms or human raters. Six different coefficient types can be calculated, as described later, along with segment centroids and the distances between them.
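As a rough illustration of the kind of calculation described above, the sketch below computes the Jaccard coefficient and the centroid distance for two binary segmentations of the same image. It is not the code of the software itself: the function names, the list-of-rows mask representation, and the two toy masks are assumptions made for this example.

```python
import math

def jaccard(mask_a, mask_b):
    """Jaccard coefficient |A and B| / |A or B| for two equal-sized binary masks."""
    inter = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                inter += 1
            if a or b:
                union += 1
    # Convention assumed here: two empty masks are treated as identical.
    return 1.0 if union == 0 else inter / union

def centroid(mask):
    """Mean (row, column) position of the pixels that belong to the segment."""
    points = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not points:
        raise ValueError("mask contains no segment pixels")
    n = len(points)
    return (sum(r for r, _ in points) / n, sum(c for _, c in points) / n)

def centroid_distance(mask_a, mask_b):
    """Euclidean distance, in pixels, between the two segment centroids."""
    (ra, ca), (rb, cb) = centroid(mask_a), centroid(mask_b)
    return math.hypot(ra - rb, ca - cb)

# Hypothetical segmentations of the same 3 x 4 image by two raters.
rater_1 = [[0, 1, 1, 0],
           [0, 1, 1, 0],
           [0, 0, 0, 0]]
rater_2 = [[0, 0, 1, 1],
           [0, 0, 1, 1],
           [0, 0, 0, 0]]

print(jaccard(rater_1, rater_2))
print(centroid_distance(rater_1, rater_2))
```

In this toy case the two segments share 2 pixels out of 6 in their union, so the coefficient is 1/3, and the centroids lie one pixel apart.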







The software was originally produced to calculate similarity coefficients in a study [5] of human rater segmentation of the cross section of the brachial plexus on an ultrasound image (Figure 1). Segments are denoted by the application of colour; the use of different colours allows multiple segments to be defined in each image.


JLAvV designed the study, performed the measurements (observer B), analyzed and interpreted the data, and wrote the manuscript. SL designed the study, interpreted the data, and wrote the manuscript. AG performed the measurements (observer A) and analyzed the data. MK and WJN developed and provided the FatSeg software programme and provided technical advice and support for the measurements and the calculation of the Jaccard similarity coefficients. SPW provided statistical and methodological advice and interpreted the data. JWA Burger provided the clinical data. RWFdB and JNMIJ interpreted the data and supervised the study. All authors critically revised the manuscript and approved it for publication.


LotuS2 can be accessed through (i) Bioconda, (ii) a Docker image, or (iii) GitHub (accessible through ) (Fig. 1A). The GitHub version comes with an installer script that downloads the required databases and installs and configures LotuS2 with its dependencies. Alternatively, we provide (iv) a wrapper for Galaxy [23] that allows LotuS2 to be installed on any Galaxy server from the Galaxy ToolShed. LotuS2 is already available for free on the UseGalaxy.eu server ( ), where raw reads can be uploaded and analysed (Supplementary Figure S1). While LotuS2 is natively programmed for Unix (Linux, macOS) systems, other operating systems are supported through the Docker image or the Galaxy web interface.


To validate the cluster analysis and genetic structure, the cophenetic correlation coefficient (CCC) was calculated using UPGMA. The distribution of populations was analyzed by principal component analysis (PCA), carried out in PAST version 2 (Hammer et al. 2001). The number of significant components to interpret from the PCA was determined using both the Jolliffe cut-off value and the broken-stick model (Jolliffe 2002).
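The two component-selection rules mentioned above can be sketched in a few lines. This is an illustration only, not PAST's code. It assumes the Jolliffe cut-off is 0.7 times the mean eigenvalue, and that the broken-stick model retains leading components for as long as their observed share of variance exceeds the expected share. The function names and eigenvalues are hypothetical.

```python
def broken_stick(p):
    """Expected share of variance for each of p components under the broken-stick model."""
    return [sum(1.0 / i for i in range(k, p + 1)) / p for k in range(1, p + 1)]

def components_to_keep(eigenvalues):
    """Return (count under the Jolliffe cut-off, count under the broken-stick model)."""
    eig = sorted(eigenvalues, reverse=True)
    total = sum(eig)
    p = len(eig)

    # Assumed definition: Jolliffe cut-off = 0.7 * mean eigenvalue.
    cutoff = 0.7 * total / p
    n_jolliffe = sum(1 for e in eig if e > cutoff)

    # Broken stick: keep leading components while observed share > expected share.
    n_stick = 0
    for e, expected in zip(eig, broken_stick(p)):
        if e / total <= expected:
            break
        n_stick += 1
    return n_jolliffe, n_stick

# Hypothetical PCA eigenvalues, for illustration only.
eigenvalues = [3.0, 1.2, 0.5, 0.3]
print(components_to_keep(eigenvalues))
```

The two rules need not agree. With these made-up eigenvalues the cut-off retains two components and the broken-stick model retains one, which is one reason to report both.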


We first used data from the Genomics of Drug Sensitivity in Cancer project, consisting of 139 drugs and a panel of 790 cancer cell lines (release-5.0). Drug response was measured experimentally as log-transformed IC50 values (the concentration of a drug required for 50% inhibition in vitro, given as the natural log of μM). Notably, a lower IC50 value indicates greater sensitivity of a cell line to a given drug. In addition, cell lines were characterized by a set of genomic features. We selected the 652 cell lines for which both drug response data and gene expression data were available. Furthermore, we focused on the 135 drugs for which SDF files (encoding the chemical structure of the drugs) were available from the NCBI PubChem Repository. PubChem fingerprint descriptors were then computed using the PaDEL software [19]. The resulting drug response matrix of 135 drugs by 652 cell lines has 88,020 entries, of which 17,344 (19.70%) are missing and 70,676 are known. For a pair of drugs, the similarity between their fingerprints was measured by the Jaccard coefficient. Cell line similarities, on the other hand, were calculated from gene expression profiles, using the Pearson correlation coefficient as the similarity between two cell lines.
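The two similarity measures in this paragraph can be sketched as follows: Jaccard for binary fingerprints and Pearson correlation for expression profiles. This is a minimal illustration under stated assumptions, not the study's pipeline. The helper names, the five-bit fingerprints, and the three-gene profiles are invented for the example; real PubChem fingerprints are much longer.

```python
import math

def jaccard_bits(fp_a, fp_b):
    """Jaccard coefficient of two equal-length binary fingerprints."""
    both = sum(1 for a, b in zip(fp_a, fp_b) if a and b)
    either = sum(1 for a, b in zip(fp_a, fp_b) if a or b)
    # Convention assumed here: two all-zero fingerprints are treated as identical.
    return both / either if either else 1.0

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sd_x * sd_y)

def similarity_matrix(rows, sim):
    """Square matrix of pairwise similarities between the given rows."""
    return [[sim(a, b) for b in rows] for a in rows]

# Hypothetical data: 3 drugs x 5 fingerprint bits, 3 cell lines x 3 genes.
fingerprints = [[1, 1, 0, 1, 0],
                [1, 0, 0, 1, 1],
                [0, 0, 1, 0, 0]]
expression = [[2.0, 4.0, 6.0],
              [1.0, 2.0, 3.5],
              [5.0, 3.0, 1.0]]

drug_sim = similarity_matrix(fingerprints, jaccard_bits)
cell_sim = similarity_matrix(expression, pearson)
print(drug_sim[0][1], cell_sim[0][2])
```

Both matrices are symmetric with ones on the diagonal. Jaccard similarities lie in [0, 1], whereas Pearson similarities lie in [-1, 1].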


Since the objective function (5) is not convex with respect to the variables U and V, we searched for a local minimum rather than the global minimum, using an alternating minimization algorithm. The algorithm, which is derived in detail in Additional file 1, updates the variables U and V alternately. We provide this algorithm in the following, and the software can be freely downloaded from the website ( ).
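Alternating minimization can be illustrated on a much simpler problem than the one in the paper. Objective (5) does not appear in this excerpt, so the sketch below substitutes a stand-in: a rank-1 factorization of a matrix with missing entries, with squared error on the known entries plus a ridge penalty. The objective, the function names, the parameter `lam`, and the toy matrix are all assumptions. The point is only the pattern: fix one block of variables, solve exactly for the other, and repeat.

```python
def objective(Y, u, v, lam):
    """Squared error over known entries plus a ridge penalty (stand-in objective)."""
    err = sum((y - u[i] * v[j]) ** 2
              for i, row in enumerate(Y)
              for j, y in enumerate(row) if y is not None)
    return err + lam * (sum(a * a for a in u) + sum(b * b for b in v))

def alternating_minimization(Y, lam=0.01, n_iter=50):
    """Fit Y[i][j] ~ u[i] * v[j] on the known entries by updating u and v in turn."""
    n, m = len(Y), len(Y[0])
    u = [1.0] * n
    v = [1.0] * m
    history = []
    for _ in range(n_iter):
        # With v fixed, each u[i] has a closed-form ridge-regression solution.
        for i in range(n):
            known = [j for j in range(m) if Y[i][j] is not None]
            u[i] = (sum(Y[i][j] * v[j] for j in known)
                    / (lam + sum(v[j] ** 2 for j in known)))
        # With u fixed, the same holds for each v[j].
        for j in range(m):
            known = [i for i in range(n) if Y[i][j] is not None]
            v[j] = (sum(Y[i][j] * u[i] for i in known)
                    / (lam + sum(u[i] ** 2 for i in known)))
        history.append(objective(Y, u, v, lam))
    return u, v, history

# Hypothetical response matrix; None marks a missing measurement.
Y = [[2.0, 4.0, None],
     [1.0, None, 3.0],
     [3.0, 6.0, 9.0]]

u, v, history = alternating_minimization(Y)
print(u[0] * v[2], u[1] * v[1])  # predictions for the two missing entries
print(history[0], history[-1])   # objective after the first and last sweep
```

Because each half-step minimizes the objective exactly over one block, the objective never increases from one sweep to the next. That guarantees steady progress but, as the text notes for a non-convex objective, not a global minimum.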

