Creating a custom GATK Walker (GATK 3.6) : my notebook
YOKOFAKUN, 2y ago
This is my notebook for creating a custom engine in GATK.

Description

I want to read a VCF file and to get a table of category/count. Something like this:

HAVE_ID  TYPE   COUNT
YES      SNP    123
NO       SNP    3
NO       INDEL  13

Class Category

I create a class Category describing each row in the table. It's just a List of Strings:

    static class Category implements Comparable<Category> ..read more
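The excerpt stops before the body of Category, so the following is only a minimal sketch of the idea, not the author's code: a row key held as a list of strings, compared label by label, and counted in a sorted map. The outer class name, the `count` helper and the sample rows are assumptions made for illustration, and no GATK classes are used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CategoryCountSketch {
    /** One table row key: an ordered list of labels, e.g. [YES, SNP]. */
    static class Category implements Comparable<Category> {
        final List<String> labels;

        Category(String... labels) {
            this.labels = new ArrayList<>(Arrays.asList(labels));
        }

        /** Compares label by label; a shorter list sorts first on a tie. */
        @Override
        public int compareTo(Category other) {
            int n = Math.min(labels.size(), other.labels.size());
            for (int i = 0; i < n; i++) {
                int d = labels.get(i).compareTo(other.labels.get(i));
                if (d != 0) return d;
            }
            return Integer.compare(labels.size(), other.labels.size());
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Category && labels.equals(((Category) o).labels);
        }

        @Override
        public int hashCode() {
            return labels.hashCode();
        }

        @Override
        public String toString() {
            return String.join("\t", labels);
        }
    }

    /** Counts occurrences of each category; the TreeMap keeps rows sorted. */
    static Map<Category, Long> count(List<Category> observed) {
        Map<Category, Long> counts = new TreeMap<>();
        for (Category c : observed) {
            counts.merge(c, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical observations; a real walker would derive them from each variant.
        List<Category> observed = Arrays.asList(
            new Category("NO", "SNP"),
            new Category("YES", "SNP"),
            new Category("NO", "INDEL"),
            new Category("NO", "SNP"));
        count(observed).forEach((k, v) -> System.out.println(k + "\t" + v));
    }
}
```

Using the category itself as the map key is what makes Comparable useful here: the table comes out sorted without a separate sorting step.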
Hello WDL ( Workflow Description Language )
YOKOFAKUN, 2y ago
This is a quick note about my first WDL workflow (Workflow Description Language) https://software.broadinstitute.org/wdl/. As a Makefile, my workflow would be the following one:

NAME?=world

$(NAME)_sed.txt : $(NAME).txt
	sed 's/Hello/Goodbye/' $< > $@

$(NAME).txt:
	echo "Hello $(NAME)" > $@

Executed as:

$ make NAME=WORLD
echo "Hello WORLD" > WORLD.txt
sed 's/Hello/Goodbye/' WORLD.txt > ..read more
Writing a Custom ReadFilter for the GATK, my notebook.
YOKOFAKUN, 2y ago
The GATK contains a set of predefined read filters that "filter or transfer incoming SAM/BAM data files": BadCigar, BadMate, CountingRead, DuplicateRead, FailsVendorQualityCheck, LibraryRead, MalformedRead, MappingQuality, MappingQualityUnavailable (...)

With the help of the modular architecture of the GATK, it's possible to write a custom ReadFilter. In this post I'll write a ReadFilter that removes the ..read more
Playing with #magicblast, the #NCBI Short read mapper. My notebook
YOKOFAKUN, 2y ago
NCBI MAGIC Blast was recently mentioned by BioMickWatson on Twitter.

Looks pretty cool. Perhaps once again the answer to all bfx questions will be BLAST RE https://t.co/4D5e9QQnrb pic.twitter.com/bwW3y0yl2n
- Mick Watson (@BioMickWatson) September 9, 2016

Here, I'll be playing with magicblast and I'll compare its output with bwa (Makefile below). First, here is an extract of the manual for ..read more
Pubmed: extracting the 1st authors' gender and location who published in the Bioinformatics journal.
YOKOFAKUN, 2y ago
In this post I'll get some statistics about the 1st authors in the "Bioinformatics" journal from pubmed. I'll extract their genders and locations. I'll use some tools I described some years ago but have since rewritten.

Downloading the data

To download the papers published in Bioinformatics, the pubmed/entrez query is '"Bioinformatics"[jour]'. I use pubmeddump to download all those ..read more
Playing with the @ORCID_Org / @ncbi_pubmed graph. My notebook.
YOKOFAKUN, 2y ago
"ORCID provides a persistent digital identifier that distinguishes you from every other researcher and, through integration in key research workflows such as manuscript and grant submission, supports automated linkages between you and your professional activities ensuring that your work is recognized."

I've recently discovered that pubmed now integrates ORCID identifiers. And so it begins! :-D ..read more
Finding new intron-exon junctions using the public Encode RNASeq data
YOKOFAKUN, 2y ago
I've been asked to look for some new / suspected / previously uncharacterized intron-exon junctions in public RNASeq data. I've used the BAMs under http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeCaltechRnaSeq/. The following command is used to build the list of BAMs:

curl -s "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeCaltechRnaSeq/" |\
tr ' <>"' "\n ..read more
Now in picard: two javascript-based tools filtering BAM and VCF files.
YOKOFAKUN, 2y ago
SamJS and VCFFilterJS are two tools I wrote for jvarkit. Both tools use the embedded Java JavaScript engine to filter BAM or VCF files. To reach a broader audience, I've copied those functionalities to Picard in 'FilterSamReads' and 'FilterVcf'.

FilterSamReads

FilterSamReads filters a SAM or BAM file with a JavaScript expression using the Java JavaScript engine. The script puts the following ..read more
Reading a VCF file faster with java 8, htsjdk and java.util.stream.Stream
YOKOFAKUN, 2y ago
Java 8 streams "support functional-style operations on streams of elements, such as map-reduce transformations on collections". In this post, I will show how I've implemented a java.util.stream.Stream of VCF variants that counts the number of items in dbsnp. This example uses the java htsjdk API for reading variants. When using parallel streams, the main idea is to implement a java.util.Spliterator ..read more
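The excerpt is cut off before the implementation, so this is only a rough sketch of the general pattern, not the author's htsjdk-based code. It assumes variants arrive as plain text lines from an iterator and that header lines start with '#'. The class and method names are made up for illustration. It extends Spliterators.AbstractSpliterator, whose default trySplit hands out batches of elements, which is one simple way to let a sequential source feed a parallel stream.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class VariantLineStreamSketch {
    /** Emits only the data lines of a VCF-like input, skipping '#' header lines. */
    static class VariantLineSpliterator extends Spliterators.AbstractSpliterator<String> {
        private final Iterator<String> lines;

        VariantLineSpliterator(Iterator<String> lines) {
            // Long.MAX_VALUE means the number of elements is unknown.
            super(Long.MAX_VALUE, Spliterator.ORDERED | Spliterator.NONNULL);
            this.lines = lines;
        }

        @Override
        public boolean tryAdvance(Consumer<? super String> action) {
            while (lines.hasNext()) {
                String line = lines.next();
                if (line.startsWith("#")) {
                    continue; // header line, not a variant
                }
                action.accept(line);
                return true;
            }
            return false; // input exhausted
        }
    }

    /** Wraps the spliterator in a Stream, optionally parallel. */
    static Stream<String> variantLines(Iterator<String> lines, boolean parallel) {
        return StreamSupport.stream(new VariantLineSpliterator(lines), parallel);
    }

    /** Counts the variant lines, mirroring the "count the items" use case. */
    static long countVariants(Iterator<String> lines, boolean parallel) {
        return variantLines(lines, parallel).count();
    }

    public static void main(String[] args) {
        // Hypothetical in-memory input standing in for a real VCF file.
        Iterator<String> input = Arrays.asList(
            "##fileformat=VCFv4.2",
            "#CHROM\tPOS\tID\tREF\tALT",
            "1\t100\trs1\tA\tT",
            "1\t200\t.\tG\tC").iterator();
        System.out.println(countVariants(input, true));
    }
}
```

With a real reader, the iterator would come from the VCF parsing library and the element type would be its variant class rather than String.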
Registering a tool in the @ELIXIREurope registry using XML, XSLT, JSON and curl. My notebook.
YOKOFAKUN, 2y ago
The Elixir Registry / pmid:26538599 "A portal to bioinformatics resources world-wide. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools."

In this post, I will describe how I've used the bio.tools API to register ..read more
