Thirteenth Week – December 05 [Final Report]

This week was certainly a busy one. Unfortunately, there was no class held on Wednesday; however, this week I completed the culmination of the course: I performed my analysis and typed up my final report, which I attached below as a PDF. Everything that I did this week is described in detail in the report, so typing it again would be redundant; suffice it to say, it was time-consuming but also very fascinating. I learned a lot through both this project and this course as a whole, and I greatly enjoyed both the course and having Prof. Davis as a professor this semester.

*please note that, while the PDF itself is quite long, only about half of it is my actual report, while the remainder are screenshots of the Python code that I typed in JupyterLab throughout my analysis*

Eric Faith, MTH231, Final Report

Twelfth Week – November 28th

*Note that this post is for the week of November 22nd to November 28th; I had this typed up and prepared to post this on the 28th but wasn’t able to post it until the following day*.

This week has been met with surprisingly positive news; I was successfully able to access all data utilizing Jupyter Notebook, meaning I should be able to analyze all desired data without much issue. While there was no class session held last week due to Thanksgiving break, I experimented with accessing the data on my own time over the weekend, and was able to utilize Jupyter to open the data for analysis with ease. In retrospect, I wish I had attempted this earlier on, since it worked much more easily and efficiently than expected, though there are still almost two weeks left in the semester to develop a specific question and perform analysis to answer the question. My goal for the next few days is to develop a specific question (which shouldn’t be too terribly difficult, since I already have several ideas in mind) and discuss the current status with my professor, and then perform the analysis of my data in Jupyter and begin writing up the final report by the end of the week.

Eleventh Week – November 21st *retroactive*

**This is my post for the week of November 15th – 21st; I meant to post this on Friday Nov 21st but did not get the chance to**

Unfortunately the past week has been a rather inactive week in terms of progress on my main project. I still have not had the opportunity to test the datasets utilizing R/Python/Tableau; while a class meeting was held on Wednesday Nov. 18th, the issues I’ve had with accessing the dataset have largely hindered my opportunities at analysis, and to my knowledge, Prof. Davis was not able to access the datasets utilizing any available software, either. At this point, I am planning to speak more with Prof. Davis at our next class meeting, to discuss whether it may be better to utilize only the two datasets that I’ve been able to open, since the semester is coming to an end and it may be wise for me to simply begin work on my analysis.

Tenth Week – November 16th

This week, there was no Zoom meeting held, due to Veterans Day falling on our usual meeting day (Wednesdays). I also unfortunately did not get the chance to work on any of the previous goals I had set (i.e. opening the large dataset in Tableau, Python or R), nor did I hear back from Professor Davis regarding whether or not he was able to successfully open the large datasets in any software or program. In the time between now and the next class meeting, I’m hoping to finally try to access the datasets using one of the aforementioned programs, and speak to Professor Davis more about the topic in our next class.

Ninth Week – November 6th

This week, I attended the class meeting on Wednesday, where Professor Davis and I again briefly discussed the difficulty I had been having with opening the large datasets. I have not been able to make much progress with that, and am still not able to get the datasets opened in Excel or Sheets; I have not yet attempted opening them in R, Python, or Tableau, which I am planning to attempt in the coming week. Professor Davis also said he would try opening them in analytic software, since he had not had the chance to so far; hopefully by the time of our class next Wednesday, one of us will have had some luck in opening the files. Otherwise, I’ll probably just end up using the Prey Length dataset (which, as I said before, would still offer a variety of different options).

Eighth Post – October 30th

I submitted my mid-semester progress report early this week, which I completed and submitted without any issues. I was also able to attend the class meeting on Wednesday, Oct 28, during which I discussed with the professor the recent changes in my project plan and the subsequent work I’ve done with the datasets. I mentioned that I had to contact a researcher from the National Marine Fisheries Service’s Northeast Fisheries Science Center for assistance with obtaining access to the new datasets, as well as the difficulties I’ve had with opening the two large datasets in analysis software (Excel, Google Sheets, etc.). Professor Davis asked me to email him my datasets to see if he could figure out a way to open them for analysis; unfortunately, the two large datasets were too large to even be sent using Outlook, so I instead sent a link to the download page, where the files can be downloaded as a ZIP folder and extracted to yield the CSV files. After the class meeting, one of my classmates, Shaelyn, recommended that I try opening the data files using Tableau (which we downloaded for our DSC201 class), which is an idea that I hadn’t even thought of. I haven’t attempted it yet, but I was very appreciative of Shaelyn’s assistance, and early next week, I’m planning to work on trying to open the files in Tableau. If that is unsuccessful, I’ve also considered possibly using R or Python to access the data without needing to open the file in a spreadsheet (which I should be able to do, since I can access the data in a text editor and therefore can identify the column labels and attributes).
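As a rough illustration of that last idea, here is a minimal sketch (not the code from my actual analysis) of how Python's built-in `csv` module could read just the column labels and the first few records of a large CSV without loading the whole file. The function name `peek_csv` and the file name in the usage comment are placeholders I made up for this example.

```python
import csv
from itertools import islice


def peek_csv(lines, n_rows=5):
    """Return (header, first n_rows records) without reading the rest.

    `lines` can be an open text file or any iterable of CSV lines.
    """
    reader = csv.reader(lines)
    header = next(reader, [])               # first line holds the column labels
    sample = list(islice(reader, n_rows))   # stop after n_rows records
    return header, sample


# Hypothetical usage; "prey_data.csv" is a placeholder file name.
# with open("prey_data.csv", newline="") as f:
#     header, sample = peek_csv(f)
#     print(header)
```

Because the reader pulls lines one at a time, only the header and the sampled rows are ever held in memory, so the size of the file should not matter for this step.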

Seventh Post – October 23rd

This week, I was able to attend the meeting held on Wednesday, during which I was able to discuss both my dataset plans and general project plans with the Professor and other students, as well as the mid-semester report, which we should submit soon. Initially, I had a handful of datasets bookmarked on my desktop, mostly pertaining to climate change and/or marine science, and my initial goal was to utilize one revolving around sea level rise in the Northeastern United States. However, I soon encountered an issue when downloading some of the files, namely that most of them contained only metadata, and not the raw data itself. This turned out to be the case for the majority of datasets on sea level rise that I had, so as a result, I had to shift topics slightly; instead of focusing on climate change, I decided to utilize one of the databases I found on fisheries reports from the Northeast United States. This wasn’t an issue, since fisheries science is another topic that I am fascinated by, and it’s one that I find just as interesting as climate change.

I actually had to email one of the researchers at the Northeast Fisheries Science Center in Woods Hole, MA, since I was initially having difficulties with obtaining the data; fortunately the researcher I contacted responded quite quickly, and he was able to provide some assistance in accessing the raw data. The database actually consists of four individual CSV files: one serving as a reference for the name/taxonomy of each species, one documenting physiology/habitat data for prey species, one documenting physiology/habitat data for predator species, and one documenting length/sex data for prey species; all the data was obtained from various fisheries surveys in the Northwest Atlantic (off the coast of the Northeast United States). Each of the datasets is quite large; the datasets on Prey Data and Predator Data were actually too large to open in Excel or Google Sheets (I am able to view them using a text editor, but that will not be sufficient for actual analysis), so as a result, I will likely end up using the Prey Length dataset for my analyses, since it still offers a huge variety of data to work with, including the year/cruise of sample, species ID, the length of the specimen, the sex of the specimen, etc. My current goal is to work on the mid-semester report and finish it by the next class, and begin evaluating what types of questions I want to investigate using my datasets.
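To give a sense of the kind of question a length dataset like this could answer, below is a small sketch that streams through CSV rows one at a time and computes the average specimen length per species. The column names `species_id` and `length` are assumptions for illustration only; the real files may label these fields differently, and this is not the analysis from my report.

```python
import csv
from collections import defaultdict


def mean_length_by_species(lines, species_col="species_id", length_col="length"):
    """Average length per species, reading one row at a time.

    `lines` can be an open text file or any iterable of CSV lines.
    The default column names are hypothetical placeholders.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(lines):
        try:
            value = float(row[length_col])
        except (TypeError, ValueError):
            continue  # skip blank or non-numeric lengths
        totals[row[species_col]] += value
        counts[row[species_col]] += 1
    return {species: totals[species] / counts[species] for species in totals}
```

Since only the running totals and counts are kept, this approach should work the same way on a file that is too large for a spreadsheet program.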

Sixth Post – October 16th

During the most recent week, there was no Zoom meeting held, so I did not have any chance to directly discuss my project status with the professor or the class. However, a video lecture was uploaded by the Professor discussing some statistical techniques, which I watched, and I was also able to watch the lecture recording from last week (Oct. 7th), since I wasn’t able to attend that meeting at the time. At this point, I have a handful of different datasets bookmarked and have been contemplating which one to finally go with, though I still haven’t made a final decision. In the time between now and the next meeting, I am hoping to narrow down the choices as much as possible and ideally decide on one specific dataset by the end of the next meeting. Additionally, I’m beginning to prepare for the mid-semester writing assignment, which I assume will be assigned at some point in the coming weeks, since this previous week was the seventh week of classes and therefore about the mid-semester point.

Fifth Post – October 10th *retroactive*

**This is my post for the week of October 4th – 10th; I meant to post this on October 10th but did not get the chance to**

For our fifth full week of classes, I unfortunately was not able to accomplish as much as I’d hoped. I was not able to make it to the Zoom meeting on October 7th, and unfortunately the recording is not yet available on the Cloud, so I was not able to see what content or info I may have missed; I am currently hoping to be able to view it and catch up as soon as it is uploaded. Fortunately, I was able to locate a handful of datasets online, including several on different metrics related to climate change (sea level rise, glacial melt, temperature increase, etc.), and I am planning on sifting through them to decide which one would be best to use for my project. I am currently aiming to attend the next upcoming lecture on Wednesday, October 14th, to further discuss the different datasets I’ve located and begin the process of selecting one, as well as to get in touch with the Professor on the difficulties I’ve had with viewing last week’s lecture recording.