**This is my post for the week of November 15th – 21st; I meant to post this on Friday Nov 21st but did not get the chance to**
Unfortunately, the past week has been rather inactive in terms of progress on my main project. I still have not had the opportunity to test the datasets in R, Python, or Tableau; while a class meeting was held on Wednesday, Nov. 18th, the issues I’ve had with accessing the datasets have largely hindered my analysis, and to my knowledge, Prof. Davis was not able to open them in any available software, either. At this point, I am planning to speak more with Prof. Davis at our next class meeting to discuss whether it may be better to use only the two datasets that I’ve been able to open, since the semester is coming to an end and it may be wise for me to simply begin work on my analysis.
This week, there was no Zoom meeting held, due to Veterans Day falling on our usual meeting day (Wednesday). I also unfortunately did not get the chance to work on any of the previous goals I had set (i.e., opening the large datasets in Tableau, Python, or R), nor did I hear back from Professor Davis regarding whether he was able to open the large datasets in any software or program. In the time between now and the next class meeting, I’m hoping to finally try to access the datasets using one of the aforementioned programs, and to speak to Professor Davis more about the topic in our next class.
This week, I attended the class meeting on Wednesday, where Professor Davis and I again briefly discussed the difficulty I had been having with opening the large datasets. I have not been able to make much progress with that, and am still not able to get the datasets opened in Excel or Sheets; I have not yet attempted opening them in R, Python, or Tableau, which I am planning to do in the coming week. Professor Davis also said he would try opening them in analytic software, since he had not had the chance to so far; hopefully by the time of our class next Wednesday, one of us will have had some luck in opening the files. Otherwise, I’ll probably just end up using the Prey Length dataset (which, as I said before, would still offer a variety of different options).
I submitted my mid-semester progress report early this week, which I completed and submitted without any issues. I was also able to attend the class meeting on Wednesday, Oct 28, during which I discussed with the professor the recent changes in my project plan and the subsequent work I’ve done with the datasets. I mentioned that I had to contact a researcher from the National Marine Fisheries Service’s Northeast Fisheries Science Center for assistance with obtaining access to the new datasets, as well as the difficulties I’ve had with opening the two large datasets in analysis software (Excel, Google Sheets, etc.). Professor Davis asked me to email him my datasets to see if he could figure out a way to open them for analysis. Unfortunately, the two large datasets were too large even to be sent using Outlook, so I instead sent a link to the download page, where the files can be downloaded as a ZIP folder and extracted to yield the CSV files. After the class meeting, one of my classmates, Shaelyn, recommended that I try opening the data files using Tableau (which we downloaded for our DSC201 class), which is an idea that I hadn’t even thought of. I haven’t attempted it yet, but I was very appreciative of Shaelyn’s assistance, and early next week, I’m planning to work on trying to open the files in Tableau. If that is unsuccessful, I’ve also considered using R or Python to access the data without needing to open the file in a spreadsheet program (which I should be able to do, since I can view the data in a text editor and therefore can identify the column labels and attributes).
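As a rough illustration of the last idea above, here is a minimal Python sketch that reads a CSV file one row at a time instead of loading it all at once, so the file size matters much less than it does in a spreadsheet. The function name `peek_csv` and the example file name are hypothetical, and the sketch assumes the file is comma-separated, UTF-8 encoded, and has its column labels on the first line.

```python
import csv


def peek_csv(path, n_preview=5):
    """Stream a CSV file and return (column_labels, first_rows, row_count).

    Only one row is held in memory at a time (plus the small preview),
    so this should work even when the file is too big for a spreadsheet.
    Assumes the first line of the file contains the column labels.
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # first line: column labels
        preview = []
        total = 0
        for row in reader:
            if total < n_preview:
                preview.append(row)  # keep a few rows to look at
            total += 1
    return header, preview, total


# Hypothetical usage (the file name is a placeholder, not the real one):
# columns, first_rows, n_rows = peek_csv("prey_data.csv")
# print(columns)
# print(n_rows)
```

Knowing the column labels and the total row count is usually enough to decide whether to keep working with the whole file or to extract a smaller piece of it first.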
My mid-semester progress report is viewable as a PDF file below. At the bottom of the report, a link and instructions are available to download and view the raw data itself.
This week, I was able to attend the meeting held on Wednesday, during which I discussed my dataset plans and general project plans with the professor and other students, and we also talked about the mid-semester report, which we should submit soon. Initially, I had a handful of datasets bookmarked on my desktop, mostly pertaining to climate change and/or marine science, and my initial goal was to use one revolving around sea level rise in the Northeastern United States. However, I soon encountered an issue when downloading some of the files, namely that most of them contained only metadata, and not the raw data itself. This turned out to be the case for the majority of the datasets on sea level rise that I had, so as a result, I had to shift topics slightly; instead of focusing on climate change, I decided to use one of the databases I found on fisheries reports from the Northeast United States. This wasn’t an issue, since fisheries science is another topic that I am fascinated by, and it’s one that I find just as interesting as climate change.
I actually had to email one of the researchers at the Northeast Fisheries Science Center in Woods Hole, MA, since I was initially having difficulties with obtaining the data; fortunately, the researcher I contacted responded quite quickly, and he was able to provide some assistance in accessing the raw data. The database consists of four individual CSV files: one serving as a reference for the name/taxonomy of each species, one documenting physiology/habitat data for prey species, one documenting physiology/habitat data for predator species, and one documenting length/sex data for prey species. All the data was obtained from various fisheries surveys in the Northwest Atlantic (off the coast of the Northeast United States). Each of the datasets is quite large; the datasets on Prey Data and Predator Data were actually too large to open in Excel or Google Sheets (I am able to view them using a text editor, but that is not sufficient for actual analysis). As a result, I will likely end up using the Prey Length dataset for my analyses, since it still offers a huge variety of data to work with, including the year/cruise of the sample, the species ID, the length of the specimen, the sex of the specimen, etc. My current goal is to work on the mid-semester report and finish it by the next class, and to begin evaluating what types of questions I want to investigate using my datasets.
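One likely reason the larger files fail in Excel is that a worksheet holds at most 1,048,576 rows. A summary can still be computed without a spreadsheet by streaming the file and keeping only running totals. The sketch below is an assumption-laden illustration, not the method used in this project: the function name `mean_by_group` and the column names in the usage comment are hypothetical and would need to match the real column labels.

```python
import csv
from collections import defaultdict


def mean_by_group(path, group_col, value_col):
    """Return {group: mean of value_col} by streaming the CSV row by row.

    Only a running sum and count per group are kept in memory.
    Rows whose value is blank or not a number are skipped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            try:
                value = float(row[value_col])
            except (TypeError, ValueError):
                continue  # blank or non-numeric entry
            sums[row[group_col]] += value
            counts[row[group_col]] += 1
    return {group: sums[group] / counts[group] for group in sums}


# Hypothetical usage (file and column names are placeholders):
# mean_length = mean_by_group("prey_length.csv", "species_id", "length")
```

A per-species mean length is just one example of a question this kind of pass could answer; the same pattern works for counts per year or per cruise.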
During the most recent week, there was no Zoom meeting held, so I did not have any chance to directly discuss my project status with the professor or the class. However, the professor uploaded a video lecture discussing some statistical techniques, which I watched, and I was also able to watch the lecture recording from last week (Oct. 7th), since I wasn’t able to attend that meeting at the time. At this point, I have a handful of different datasets bookmarked and have been contemplating which one to go with, though I haven’t made a final decision yet. In the time between now and the next meeting, I am hoping to narrow down the choices as much as possible and ideally decide on one specific dataset by the end of the next meeting. Additionally, I’m beginning to prepare for the mid-semester writing assignment, which I assume will be assigned at some point in the coming weeks, since this previous week was the seventh week of classes and therefore about the mid-semester point.
**This is my post for the week of October 4th – 10th; I meant to post this on October 10th but did not get the chance to**
For our fifth full week of classes, I unfortunately was not able to accomplish as much as I’d hoped. I was not able to make it to the Zoom meeting on October 7th, and unfortunately the recording is not yet available on the Cloud, so I was not able to see what content or info I may have missed; I am hoping to view it and catch up as soon as it is uploaded. Fortunately, I was able to locate a handful of datasets online, including several on different metrics related to climate change (sea level rise, glacial melt, temperature increase, etc.), and I am planning on sifting through them to decide which one would be best to use for my project. I am aiming to attend the next lecture on Wednesday, October 14th, to further discuss the different datasets I’ve located and begin the process of selecting one, as well as to get in touch with the professor about the difficulties I’ve had with viewing last week’s lecture recording.
During our fourth week of class, I was unfortunately very preoccupied with work from some other classes (as I had an exam last week and have another one next week), so I have not yet completed the goal I set last week of locating a dataset on sea level rise. I made a plan last week to either locate a relevant dataset online, or get in contact with a researcher or professor from one of the local research institutions (SMAST, Woods Hole Oceanographic Institution, etc.), and while I have not yet completed that goal, I am hoping to do so by the middle of next week. Despite this minor setback, I was able to attend the class session on Wednesday, during which we discussed some applications of the z-score in analyzing the spread, skewness, and “peak”-ness of given data, as well as how to evaluate such metrics using RStudio.
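To make the z-score idea concrete, here is a small sketch written in Python rather than RStudio (a swapped-in language for illustration, not what was used in class). It assumes that “peak”-ness refers to kurtosis, and it uses the population definitions: skewness is the average of the cubed z-scores and kurtosis is the average of the z-scores raised to the fourth power. The function name `shape_summary` is hypothetical.

```python
import statistics


def shape_summary(values):
    """Return the mean, standard deviation, skewness, and kurtosis of values.

    Uses population formulas based on z-scores:
      z_i      = (x_i - mean) / sd
      skewness = mean of z_i ** 3   (0 for perfectly symmetric data)
      kurtosis = mean of z_i ** 4   (about 3 for a normal distribution)
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    if sd == 0:
        raise ValueError("all values are identical, so z-scores are undefined")
    z = [(v - mean) / sd for v in values]
    return {
        "mean": mean,
        "sd": sd,
        "skewness": statistics.fmean([zi ** 3 for zi in z]),
        "kurtosis": statistics.fmean([zi ** 4 for zi in z]),
    }


# Example: a small symmetric sample has skewness of (approximately) zero.
summary = shape_summary([1, 2, 3, 4, 5])
```

Statistical packages often report sample-adjusted versions of these measures, or "excess" kurtosis (kurtosis minus 3), so results from other tools may differ slightly from this sketch.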
For our third week of classes, I attended the Zoom discussion/lecture on Wednesday, Sept 23rd, where we described and discussed the different ideas we had for our datasets, and also went over some information on basic analysis of data (identifying the mean/median, variation between the two, different measurements of spread, etc.). For my dataset, I settled on a topic related to climate change, likely measurements of sea level rise; during the meeting, we discussed different ways of finding data on this topic, with some classmates recommending different websites and Professor Davis planning to contact some researchers at UMassD’s SMAST. In the time since the class last week, I began looking at the textbook and skimmed through some sections of the first couple of chapters; however, the PDF of notes from last week was not uploaded, so I have not yet reviewed it. Over the next week, I am planning on looking through more datasets related to sea level rise and potentially getting in contact with some researchers from SMAST or one of the research institutions in Woods Hole (e.g., WHOI, NOAA, etc.), and I am also planning to read more in the textbook and review the PDFs for this week and/or last week, once they are made available.