September 3rd
Worked on some personal administration stuff: registering for fall, going through emails, scheduling a committee meeting, etc.
Meeting with Mac tomorrow to go over the latest r.phil results, so I did some work to prep for that. I went ahead and joined the SPIDs to the DEG entries that had them (SPIDs retrieved from blast). Of the remaining ~12 DEGs, only 5 were characterized. For these 5, I used the annotated gene table on the NCBI web interface to get the gene name, which I manually entered into UniProt to find similar genes with GO, tissue, development, and induction information. Where possible, I also filtered the UniProt results by taxonomy to include only bivalves or, where no bivalve genes showed up as matches, mollusks. Genes that did not have hits for bivalves or mollusks had fewer than 35 total matches when no taxonomy filter was applied. The links to the results were then added to a ‘notes’ column in the master dataframe for reference come literature review.
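For reference, a minimal sketch of the SPID join as a left join, so that every DEG stays in the table whether or not blast returned a SPID for it. This is not the code I actually ran. The function name, the field names (`gene_id`, `spid`), and the IDs are all made up for illustration.

```python
def left_join_spids(deg_ids, blast_spids):
    """Left join: keep every DEG and attach its SPID, or None if blast gave none."""
    return [{"gene_id": g, "spid": blast_spids.get(g)} for g in deg_ids]


# Hypothetical IDs, for illustration only
deg_ids = ["LOC_A", "LOC_B", "LOC_C"]
blast_spids = {"LOC_A": "P00001", "LOC_C": "Q00002"}

joined = left_join_spids(deg_ids, blast_spids)

# DEGs that still need a manual NCBI/UniProt lookup
missing = [row["gene_id"] for row in joined if row["spid"] is None]
print(f"{len(joined)} DEGs kept, {len(missing)} without a SPID: {missing}")
```

Listing the unmatched entries explicitly makes it easy to see which DEGs fall into the "remaining" group rather than having them silently disappear from the table.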
September 4th
When we last talked, Mac and I also discussed the large number of reads being “sucked up” by ribosomal hits, so I went back to my full counts matrix and pulled the stats and counts for every gene our samples mapped to, to get a list of the ribosomal RNA.
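A rough sketch of how the ribosomal share of reads could be pulled from a counts matrix. It assumes each row carries a gene description and that rRNA entries can be flagged by the phrase "ribosomal RNA" in that description, which may not match how the real annotation is worded. The sample names, gene IDs, and counts below are invented.

```python
def rrna_fraction(rows, samples, keyword="ribosomal rna"):
    """Return the rRNA rows and, per sample, the fraction of counts they hold."""
    rrna_rows = [r for r in rows if keyword in r["description"].lower()]
    fractions = {}
    for s in samples:
        total = sum(r[s] for r in rows)
        rrna = sum(r[s] for r in rrna_rows)
        fractions[s] = rrna / total if total else 0.0
    return rrna_rows, fractions


# Toy counts matrix (hypothetical genes and counts)
rows = [
    {"gene_id": "LOC_1", "description": "28S ribosomal RNA", "S1": 600, "S2": 300},
    {"gene_id": "LOC_2", "description": "heat shock protein 70", "S1": 300, "S2": 500},
    {"gene_id": "LOC_3", "description": "18S ribosomal RNA", "S1": 100, "S2": 200},
]

rrna_rows, fractions = rrna_fraction(rows, ["S1", "S2"])
for sample, frac in fractions.items():
    print(f"{sample}: {frac:.0%} of counts on rRNA")
```

Matching on "ribosomal RNA" rather than just "ribosomal" keeps ribosomal protein genes out of the list, though the keyword would need checking against the actual annotation.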
Met with Mac to go over updated results and next steps. She gave me some really helpful insights on what could be going wrong with my blast and what steps I need to re-work through to troubleshoot:
Something is going on in the join that is causing me to lose some hits/LOCs, OR
The fasta is wrong somehow, so check the number of sequences in the fasta, OR
My blast commands need to be checked/changed. If a query doesn’t meet my e-value cutoff, is it excluded or left in with NAs? Want to make sure nothing is getting dropped.
I’m out of town for the next few days, so this will have to wait until I’m back in the office and have access to Raven, but the results I currently have are enough to move forward.
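A possible way to check the second and third points once I have access again: count the sequences in the fasta, then list which query IDs never appear in the blast output. This sketch assumes tab-separated blast output with the query ID in the first column. As far as I understand, that format only has rows for hits that pass the e-value cutoff, so no-hit queries would be absent rather than filled with NAs, but that still needs confirming against my actual command. The sequences and IDs here are made up, and the real files would be opened with `open(...)` instead of `io.StringIO`.

```python
import io


def fasta_ids(handle):
    """Return the ID (first token after '>') of every record in a FASTA stream."""
    return [line[1:].split()[0] for line in handle
            if line.startswith(">") and line[1:].strip()]


def blast_query_ids(handle):
    """Return the set of query IDs (column 1) found in tab-separated blast output."""
    return {line.split("\t")[0] for line in handle
            if line.strip() and not line.startswith("#")}


# Toy inputs standing in for the real fasta and blast output files
fasta_text = ">LOC_A some description\nATGC\n>LOC_B\nGGCC\n>LOC_C\nTTAA\n"
blast_text = "LOC_A\tsp|P00001|X\t98.5\nLOC_C\tsp|Q00002|Y\t91.0\n"

ids = fasta_ids(io.StringIO(fasta_text))
hit_ids = blast_query_ids(io.StringIO(blast_text))
no_hit = [i for i in ids if i not in hit_ids]

print(f"{len(ids)} sequences in fasta, {len(hit_ids)} with a hit, no hit: {no_hit}")
```

If the fasta count matches the number of DEGs and the no-hit list accounts for every missing LOC, the loss is happening in blast rather than in the join.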
September 6th
Very minor updates to Google Drive with my draft of methods/results and the master dataframe. After my discussion with Mac, I wanted to double-check that the SPIDs I did have actually made sense, so I checked each UniProt hit against the annotated genome hit. They checked out, so that is good.
September goals and notebook update.