Tweet Analysis of the Coronavirus
For our Cornell Data Science onboarding project, Kaitlyn Chen, Jessie Lee, and I set out to understand how people around the world responded to the coronavirus, especially in the 10 countries with the most confirmed cases. Using Twitter as our source, we analyzed which topics people were discussing (e.g. economic, humanitarian) and the polarity and subjectivity of their reactions.
Technologies: R, Python, Pandas, d3.js, Matplotlib, scikit-learn, NLTK, TF-IDF
Using Assembly Graphs to Identify Strain Variants in Hot Spring Metagenomes
Marie Crane and I researched this topic at the University of Maryland's Center for Bioinformatics and Computational Biology under the guidance of Dr. Mihai Pop, using a variety of bioinformatics tools. Metagenomic data was collected from two hot springs in Yellowstone National Park across a temperature gradient. We specifically investigated the bubble motifs in the assembly graphs of these metagenomes and their biological significance.
Technologies: Bash, R, Python, BLAST, MetagenomeScope, metaSPAdes, MetaCarvel