November Daily Posts


Megan Ewing


November 8, 2023

Daily notebook updates for November

Nov. 1 - Gearing up for clam work

Wrapping up with blast tutorial, pushing issues, moving forward:

Finally wrapped up the blast tutorial and got through the visualization of top 10 genes, which was sick. I definitely need to refresh on how to even read the results of a blast table though, just so I have a better understanding of what things like e values mean. Next/final step in that regard is to push the repo to git hub, but didn’t get around to that today since I had that issue with commiting large files and need to recreate the repo before I can re-commit and push. That will be first order of business tomorrow before lab meeting prep then lab meeting.

More on looking forward, I am about to start analysis with the clam data Mac shared. To prep for this, I spent some time today reviewing the repo and workflow I used for coho data analyses in 546 a couple years ago, to 1) refresh, 2) see what parts of the workflow I could use for this analysis, and 3) identify what things I need to improve upon in terms of reproducability (my code comments and readme docs could be better). As I continue reviewing that this week, I will start crackin’ into the clam data.

Nov. 6 - Homework and Planning

Not much lab-focused stuff today. Had class and got ahead on some assignments for the week. After that, did some planning for what to work on this week:

  • Tuesday:

    • Create new repo for blast tutorial so I can push to github and close the issue.

    • Start repo for clam data

    • Update notebook format using the TOC method we talked about in lab meeting last week

  • Wednesday:

    • Clam data work

    • Read DEI chapter/s for lab meeting

  • Thursday

    • Finish any remaining assignments for classes for the week (HW 1 for 553 main priority)
  • Friday

    • Veteran’s day / no qsci lab meeting

    • Work

Nov. 7 - Finished geoduck tutorial, start clam repo

Finished what I had hoped to today! Created/pushed the new repo for blast tutorial and closed that issue (finally). I created the repo for clam data and added a readme to each folder, and have started using the single-doc / TOC format for notebook updates.

Aside from that, I also added my November goals to my notebook, and did school stuff (class and homework).

Nov. 8 - Clam RNAseq work

Pretty chill day – some homework in the morning before class.

After class, I got started on the clam rnaseq analysis. Mostly just retrieving the files from Gannet, which took some time. While that loaded I was able to start mapping out the next step in the workflow, which will be checking md5sums to confirm file integrity, then using FastQC and MultiQC to look at the quality of our reads. I hope to run these steps tomorrow so I can start next steps (or start some reading to understand what the next steps even are) come Monday. Based on my work with Coho RNAseq data back in 2021, the next steps after fast/multiQC would be:

  • trimming

  • kallisto to quantify gene expression using the provided reference genome included in github issue.

  • use DESeq2 to visualize DEGs

  • run blast to identify DEG functions

  • join the DEG stats from DESeq2 and the identified DEGs

The last two I’m a little unsure if those are the right steps for this specific data. The coho analysis was comparing two treatments, and I’m just not entirely sure if that’s the type of analysis we are looking for here. But that is where the reading / literature hunt will come in on Monday.

While the files were loading, I read Race After Technology - Chapter 1 to prep for tomorrow’s lab meeting.

Nov. 15 - FastQC / MultiQC

Woof. Admittedly I haven’t gotten around to much lab related stuff early this week, but was able to sit and get through the fastqc tonight. Multiqc will be in the morning. I attempted, but doing it on Raven makes it a little funky since the multiqc . command I used last time doesn’t seem to work the same. Also had a little trouble running the md5sum check so I skipped for now. But luckily we’ll be goin over things in lab meeting tomorrow :)

Nov. 16 - MultiQC, Next Steps

Goals for today

  • Finish MultiQC and md5sum check

  • Draft next steps code

  • Get ahead on Lab for qsci and hw for fish553

Nov. 27 - MultiQC, Kallisto Start, School Stuff

Looked at MultiQC results and moved on to Kallisto. Currently reading through a Kallisto tutorial to refresh my memory in addition to looking to old code (that I did not do the best at annotating).

Registered for classes and outlined my day-to-day schedule for school/lab/work stuff next quarter. Signed up for 521: Research Proposal Writing (req’d) and 510: Current Topics in Genetics and Physiology, which I figured will be of relevance for my project.

As November comes to an end my goals for this week are to:

  • General: re check my November goals to see what’s left I need to wrap up

  • Tuesday

    • finish kallisto

    • start/finish deseq2 visualization

    • read papers for hot topics course (co moderating discussion this week)

  • Wednesday

    • blast on clam data

    • join DEG list and expression levels (would love to get a heatmap a start to DEG function ID by EOD wednesday)

    • meet with co moderator for hot topics

    • pre lab meeting reading

  • Thursday

    • lab meeting

    • finish up homework for the week

    • attend safs seminar

Nov. 28 - Kallisto, DESeq2

Kallisto code is finished but still running, so I wasn’t really able to do the deseq2 step like anticipated, but I did write out the code so as soon as Kallisto is finished up it should be good to go. While I may not be able to start and finish the blast tomorrow, I should be able to get the heatmap and volcano plot like I had hoped.

For other stuff, I worked on readings for hot topics and got coffee with my SAFS peer mentor (we meet every ~3 weeks).

Nov. 29 - Kallisto troubles and learning ggplot

If Kallisto was a person, this morning I was damn near getting into a bar fight with them.

Not really but ran into some issues with the output files not showing what I wanted so spent some time trouble shooting that and didn’t get to do the DESeq2 like I had hoped :(

Other than that, worked with my co moderator on a plan for our hot topics discussion tomorrow, and then had class this morning where I learned about how to use ggplot (just in time for our plot focused lab meeting tomorrow!) Here is the code for the plot I created in class:


ggplot(data = diamonds, 
       mapping = aes(x = carat, y = price, shape = cut, color = color, alpha = clarity, size = depth)) +
       geom_point() + 
       ggtitle("Diamonds are Expensive") + xlab("why? (carat)")+ ylab("how expensive? (price)") + 
       theme(plot.background = element_rect(fill = "hotpink"), 
       panel.background = element_rect(fill = "lavender"),
       legend.key.size = unit(1, 'cm'), #change legend key size
       legend.key.height = unit(.5, 'cm'), #change legend key height
       legend.key.width = unit(.5, 'cm'), #change legend key width
       legend.title = element_text(size=4), #change legend title font size
       legend.text = element_text(size=4)) + 
       scale_shape_manual(values = c(21, 22, 23, 24, 25)) + 
       scale_color_manual(values = c("pink", "pink2", "pink3", "pink4", "hotpink", "hotpink2", "hotpink3")) + 
       scale_y_continuous(breaks=seq(0, 20000, 2000))