
Diving into Genetics and Genomics
A wet-lab biologist's bioinformatics notes, mostly about Linux, R, Python, reproducible research, open science and NGS. I am into data science! I am working on glioblastoma (a terrible brain cancer) genomics at MD Anderson Cancer Center. Disclaimer: For posts that I copied from other places, credit goes to the original authors. Follow the links to the original posts; I mainly put them here.
2y ago
1/ A few basic commands will take you a long way:
git clone
git add
git commit -m
git push
Those are enough to get you started. To be honest, those are still the most frequent commands I use.
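Here is a minimal sketch of how those four commands fit together. To keep it runnable without a GitHub account, it assumes a local bare repository stands in for the remote; the directory names, file name, and demo identity are all made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory so nothing else on your machine is touched.
workdir="$(mktemp -d)"

# Assumption for this demo: a local bare repository plays the role of GitHub.
git init --quiet --bare "$workdir/remote.git"

# git clone: copy the remote repository into a local working directory.
# (Git warns that the repository is empty; that is expected here.)
git clone --quiet "$workdir/remote.git" "$workdir/project"
cd "$workdir/project"

# Identity for this demo repository only (hypothetical values).
git config user.name "Demo User"
git config user.email "demo@example.com"

# git add: stage a new file so it will be part of the next commit.
echo "# my analysis notes" > README.md
git add README.md

# git commit -m: record the staged change with a short message.
git commit --quiet -m "Add README"

# git push: send the new commit to the remote.
git push --quiet origin HEAD
```

With a real project you would replace the bare repository path with the URL of your GitHub or GitLab repo; the add, commit, push cycle stays the same.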
2/ Understand Git and GitHub. You use Git to track files locally, and GitHub hosts your repos. You can start with the GitHub Skills page https://buff.ly/3tO2iaf
GitLab https://buff.ly/3JlGA69 is an alternative to GitHub.
3/ The Software Carpentry Git workshop is a nice resource for learning Git https://buff.ly/3kUhqB7
4/ An open source game about learning Git! https://buff.ly/2ZPXUrX
5/ Learn it for fre ..read more
2y ago
1/ [An Empirical Bayes Method for Differential Expression Analysis of Single Cells with Deep Generative Models](https://www.biorxiv.org/content/10.1101/2022.05.27.493625v1) scVI-DE
2/ [muscat](http://www.bioconductor.org/packages/release/bioc/html/muscat.html)
3/ [Confronting false discoveries in single-cell differential expression](https://www.nature.com/articles/s41467-021-25960-2) "These observations suggest that, in practice, pseudobulk approaches provide an excellent trade-off between speed and accuracy for single-cell DE analysis." One needs to consider biologica ..read more
2y ago
1/ [Tips for negotiating salary and startup for newly-hired tenure-track faculty](https://dynamicecology.wordpress.com/2017/03/01/tips-for-negotiating-salary-and-startup-for-newly-hired-tenure-track-faculty/)
2/ [Creating accessibility in academic negotiations](https://www.sciencedirect.com/science/article/pii/S0968000422002870?dgcid=authord)
3/ [Ten Simple Rules to becoming a principal investigator](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007448)
4/ [applying for a faculty position](http://effortreport.libsyn.com/15-applying-for-a-faculty-position) b ..read more
2y ago
1/ [immunarch](https://immunarch.com/index.html)
2/ [scRepertoire](https://github.com/ncborcherding/scRepertoire)
3/ [dandelion](https://sc-dandelion.readthedocs.io/en/latest/) a Python package for analyzing single-cell BCR/TCR data from the 10x Genomics 5’ solution!
4/ [TRUST4](https://www.nature.com/articles/s41592-021-01142-2) developed in Shirley Liu's group. Use it to extract TCR/BCR information from bulk RNAseq or 5' scRNAseq data.
5/ a dramatic speedup for one of the core computations for adaptive immune receptor repertoire (AIRR) analysis - the discovery a ..read more
2y ago
Making a heatmap is an essential skill for a bioinformatician. Just check how many figures are heatmaps or heatmap variants in any genomics or single-cell paper.
But you probably do not fully understand heatmaps. Here are 7 reading resources to help you understand them!
1/ Mapping quantitative data to color https://www.nature.com/articles/nmeth.2134
2/ Heat maps, from the Nature Methods column https://www.nature.com/articles/nmeth.1902
3/ A tale of two heatmap functions https://rpubs.com/crazyhottommy/a-tale-of-two-heatmap-functions An old post by me.
4/ Heatmap demystified https ..read more
2y ago
* Best Practices for Biomedical Research Data Management (https://learn.canvas.net/courses/1854)
* Research Data Management Librarian Academy (https://rdmla.github.io/)
* DataONE Data Management Skillbuilding Hub (https://dataoneorg.github.io/Education)
* Data Management Training Clearinghouse (https://dmtclearinghouse.esipfed.org/)
* Research data management open training materials Zenodo Community (https://zenodo.org/communities/dcc-rdm-training-materials)
* Consortium of European Social Science Data Archives (CESSDA) Training Resources (https://www.cessda.eu/Training-Resources)
Bonus ..read more
2y ago
R packages:
* [readxl](https://readxl.tidyverse.org/)
* [tidyxl](https://github.com/nacnudus/tidyxl)
* [janitor](https://github.com/sfirke/janitor)
Command-line tools:
* [VisiData](https://www.visidata.org/) is an interactive multitool for tabular data. It combines the clarity of a spreadsheet, the efficiency of the terminal, and the power of Python, into a lightweight utility which can handle millions of rows with ease.
* [csvkit](https://csvkit.readthedocs.io/en/latest/index.html#)
* [csvtk](https://bioinf.shenwei.me/csvtk/usage/) a cross-platform, efficient and practical CSV/TSV tool ..read more
2y ago
1. ENCODE https://www.encodeproject.org/
2. The International Human Epigenome Consortium (IHEC) epigenome data portal http://epigenomesportal.ca/ihec/index.html?as=1
3. Blueprint epigenome http://dcc.blueprint-epigenome.eu/#/home
4. EpiFactors http://epifactors.autosome.ru/ is a database for epigenetic factors, corresponding genes and products.
5. CistromeDB http://cistrome.org/db/#/ by Shirley Liu's group
6. ReMap https://remap2022.univ-amu.fr/ is a large-scale integrative analysis of DNA-binding experiments for Homo sapiens, Mus musculus, Drosophila melanogaster and Arabidopsis thaliana ..read more
2y ago
1. Data science: A first introduction https://datasciencebook.ca/
2. Introduction to Data Science http://rafalab.dfci.harvard.edu/dsbook/
3. Agile Data Science with R https://edwinth.github.io/ADSwR/index.html
4. Tidy Modeling with R https://www.tmwr.org/
5. Feature Engineering and Selection: A Practical Approach for Predictive Models https://bookdown.org/max/FES/
6. Another Book on Data Science https://www.anotherbookondatascience.com/ compares R and Python side by side
7. Research Software Engineering with Python https://merely-useful.tech/py-rse ..read more
2y ago
1. A reproducible workflow. https://www.youtube.com/watch?v=s3JldKoA0zw This two-minute video will change your mind about reproducible research
2. Parallel sequencing lives, or what makes large sequencing projects successful https://academic.oup.com/gigascience/article/6/11/gix100/4557140?login=false
3. Common-sense approaches to sharing tabular data alongside publication https://www.sciencedirect.com/science/article/pii/S2666389921002300
4. A Reproducible Data Analysis Workflow with R Markdown, Git, Make, and Docker https://psyarxiv.com/8xzqy/
5. Practical Computational Reproducibility in ..read more