The Internet is a beautiful place, and I often find useful websites. The purpose of this page on my Vault is to document them all. The page begins with three sections: online portals/apps, downloadable software/scripts, and GitHub notes/lists. Later parts cover courses, online books, blogs, and YouTube channels.

Online Portals / Apps

  1. Harmonizome. I learned about this portal when Callie was using it to look up keratin-1 expression in breast cancer cell lines for our research. It is a portal for integrated knowledge about genes and proteins, developed by researchers at Mount Sinai.
  2. TOSDR, which stands for "Terms of Service; Didn't Read". Legal language can be confusing, and TOSDR wants to help you make sense of it.
  3. Bafflednerd has lists of online courses that you can take on subjects revolving around data science, programming, Linux, web design, mobile app design, etc.
  4. Seeing Theory helps anyone understand statistics through pretty visualizations. Designed by Daniel Kunin from Brown University.
  5. Gridmaster helps you learn spreadsheets. It is an attempt to reinvent spreadsheet training for a new generation by making it more realistic and approachable.
  6. MapInSeconds turns your spreadsheet data into (beautiful) maps. Just copy your data, paste it into their online editor, and download your map.
  7. RAWGraphs is an open source data visualization framework with the goal of making the visual representation of complex data easy for everyone. Source code available on GitHub.
  8. Unsplash provides free high-resolution photos. It is just free. No strings attached. All photos published on Unsplash are licensed under Creative Commons Zero, no permission required.
  9. Mapillary is a portal for crowdsourced street-level photos. Everyone can join!
  10. Allen Cell Explorer is an online 3D cell viewer. After looking at its Institute Workflow page, I dream of working there. As a confocal technician at RIT, I find this so freaking cool.
  11. USAFacts, spearheaded by Steve Ballmer. USAFacts is a new data-driven portrait of the American population, our government’s finances, and government’s impact on society.
  12. Human Cell Atlas, pretty pictures! Here's its cousin, the Human Protein Atlas (specific page).
  13. PubCrawler, an online app that alerts its subscribers to new entries in PubMed and GenBank.
  14. SciReader, just like PubCrawler but it seems to have a better UI.
  15. Galaxy, an open source, web-based platform for data-intensive biomedical research. Perfect for those who prefer a GUI to a CLI. Learn more about Galaxy here.
  16. Gapminder, a legacy built by Hans Rosling, aims to promote a better understanding of world statistics and other social, economic, and environmental information. It provides good visualizations of statistics and is a useful tool for studying economics.
  17. ASM's Resource Library, previously available at MicrobeLibrary. A cursory skim of the web page showed me that they have quite a nice amount of information about running experiments with bacteria.
  18. Happy Belly Bioinformatics hosts tutorials on bash and R, tailored to the needs of new bioinformaticians. Contents are still being written here.

Downloadable Software / Scripts

  1. Kite, the smart copilot for programmers. Kite is an ML-based code-completion tool (that runs in the cloud). It also provides in-line documentation, which should make you more productive than going to Stack Overflow every time you bump into a problem.
  2. LuLu by Objective-See. This is an open source firewall for macOS that aims to rival the commercial Little Snitch. It focuses on outgoing connections, which reminds me of WinPatrol for Windows. Follow the discussion on HN here. Linux users can run OpenSnitch, source code available on GitHub.
  3. iStat Menus. Not sure if I will ever need this on macOS, but it is helpful to know this application exists. It is a commercial app though. For Windows, the equivalent application is XMeters.
  4. Bitwarden. A free and open source password manager. I first used KeePass (it was okay), then migrated to 1Password (yeah, a little bit better). I would love not to have to pay, and that's where Bitwarden comes into play. It is free for 2 users (you & your spouse), with various pricing tiers that are very affordable. On top of that, you can self-host it. As of writing, self-hosting needs 2GB of RAM (RasPi users would be elated if it goes down to 1GB). Other open source alternatives are Passbolt and Padlock. The software engineer behind Bitwarden did an AMA in November 2016.
  5. CMap by IHMC. I was first introduced to this software by a course instructor for Introduction to Microbiology back in 2014. Really useful. Plus, it is free and can be installed both on a server and locally.
  6. Dia. This is a pretty good free tool for creating diagrams. I have been using it to create what I call a "knowledge map" (with UML) when writing a paper, to give the paper some sense of structure. Recommended for scientists who are looking for ways to get organized.
  7. JASP Stats. Think of it as the open source version of GraphPad Prism. JASP is supported by the University of Amsterdam.
  8. Orange. This is open source data-mining software that can be installed via pip or conda. It looks like a less intimidating way to get started with big data operations. Complex workflows can be designed in it. Can't wait to actually try this application.

GitHub Notes / Lists

  1. open-source-society/bioinformatics, path to a free self-taught education in Bioinformatics!
  2. jakevdp/PythonDataScienceHandbook, Jupyter Notebooks for the Python Data Science Handbook
  3. Hack-with-Github/Awesome-Hacking, a collection of various awesome lists for hackers, pentesters and security researchers
  4. dloss/python-pentest-tools, python tools for penetration testers
  5. IPGP/scientific_python_cheat_sheet, simple overview of python, numpy, scipy, matplotlib functions that are useful for scientific work
  6. stephenturner/oneliners, useful bash one-liners for bioinformatics
  7. donnemartin/data-science-ipython-notebooks, continually updated data science Python notebooks
  8. Quartz/bad-data-guide, an exhaustive reference to problems seen in real-world data along with suggestions on how to resolve them.
  9. hangtwenty/dive-into-machine-learning, dive into Machine Learning with Python Jupyter notebook and scikit-learn
  10. fivethirtyeight/data, data and code behind the stories and interactives at FiveThirtyEight
  11. Kickball/awesome-selfhosted, this is a list of Free Software network services and web applications which can be hosted locally
  12. jtleek/datasharing, the Leek group guide to data sharing
  13. jlevy/the-art-of-command-line, master the command line, in one page
  14. rtorr/vim-cheat-sheet, a mobile friendly Vim cheat sheet
  15. ./Bioinformatics training materials. An online project by Mark Dunning (CRUK Cambridge) and Thomas Carroll (MRC Clinical Sciences). I should learn some bioinformatics soon.


Online Courses

  1. Reverse Engineering Malware 101. I am not entirely sure how useful this course would be, but it looks like a fun course to kill time.
  2. Case Studies in Functional Genomics on edX. One of my friends (currently doing a Ph.D. in Australia) took it, so I think I could benefit from this course as well.
  3. The Global Financial Crisis on Coursera. This looks interesting to learn, maybe as an extension of my economics immersion.
  4. Genomics Data Analysis. There are 3 courses within this XSeries program. Free of charge. Add $49 for a certificate signed by the instructors. I should do all of these in the second year of my PhD.
  5. Galaxy Training. Learn how to use Galaxy for bioinformatic analysis here. Galaxy is pretty cool, so I would like to get good at it.
  6. Data Science: R Basics, the first of 9 courses in the HarvardX Data Science series.

Online Books

  1. Cell Biology by the Numbers, a website that gives context for understanding the numbers pertaining to our cells.
  2. Handbook of Biological Statistics, by John McDonald. This website grew out of his Biological Data Analysis class at the University of Delaware.
  3. CORE Economics, a brand-new online economics textbook for college students.
  4. Fundamentals of Data Viz by Claus O. Wilke. This is the online preview of his book, published with O'Reilly Media. Should help anyone master dataviz in R.
  5. Advanced R by Hadley Wickham. Caution: this is primarily for users who already know how to use R.
  6. R for Data Science by Garrett Grolemund & Hadley Wickham. The physical copy of this book is available on Amazon. Hadley Wickham is a big name btw.
  7. A Tutorial Introduction to R by Aaron A. King et al. It also looks like a good place to start.
  8. Data Science at the Command Line, because it is also important to learn command line tools when it comes to preparing data for analysis.
  9. Analysis of single cell RNA-seq data, because single-cell RNA-seq is the hype train of the year.


Blogs

  1. Getting Genetics Done by Dr. Stephen Turner. This blog looks pretty much dead as of today (March 15th, 2018), but it has some good information in there.
  2. Variance Explained. I learned how to interpret a p-value histogram here. Very useful.

YouTube Channels

  1. Maarten Schrader with tutorials on Lightroom for photographers.
  2. PowerPoint School, suggested by Qyira Yusri on Twitter as the YouTube channel for learning to make impressive PowerPoint presentations.
  3. Logos By Nick is the reason I am kind of proficient in Inkscape nowadays.
  4. Illustrator for Beginners. Basically what the name tells you.
  5. Egee, a YouTube channel by Brian. Great content on Linux/FOSS.
  6. tutoriaLinux, a good channel for learning Linux commands. Found this somewhere on Reddit.
  7. Real Engineering, a pretty good channel on engineering, and I enjoy it so damn much.