Claudia Tang - Summer 2023 Research Journal

Link to presentation:

- Prepare to have your mind blown.

May 30, 2023: I read papers today.

May 31, 2023: I started the ROOT tutorial. Power outage woohoo.

June 1, 2023: Finished ROOT tutorial for Python. Read more papers and had call with Ari to understand future goals.

June 2, 2023: Reading papers and watching videos to understand the attention mechanism.

June 5, 2023: Watching/doing the PyTorch tutorial after setting up an environment called testenv on tehanu (temporary until we move to Shevek).

June 6, 2023: More PyTorch tutorial. Watched a great video (

June 7, 2023: Finished PyTorch tutorial and started reading over Ari's code. Made weekly presentation slides.

June 8, 2023: Watched a loooot of videos to learn more about the attention mechanism in transformers and the difference between self vs cross attention and multi-head attention. A little review of Ari's code (it's starting to make sense????). Slight questions about notation and what parts of the transformer we'll really be implementing.

June 9, 2023: Call with Ari to go over his code. REU lectures + lab tour. Log into and set up Shevek machine to be able to open up jupyter notebook server from it. I want to thank Colin for this momentous achievement.

June 12, 2023: Call with the NuSTAR group and collaborators about the X-ray Halo project from the previous summer/semester. Set up Shevek more to try to utilize the GPU instead of the CPU. Made a guide to find shevek.

June 13, 2023: Began looking into and coding how to transform a hdf5 file into a format that can be usable by Ari's code.

June 14, 2023: Had a meeting with Ari to go over different methods to break down hdf5 into more usable forms. Had group meeting. Changed from h5py to PyTables for manually manipulating the data -> got it to break down the group formatting using ".walk_groups". Ari recommends using dl1-data-handler to do it in a more automated way.
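The `.walk_groups` approach can be sketched roughly like this — the file layout and group names below are invented for illustration, not the actual DL1 structure:

```python
import tables

# Build a tiny hypothetical HDF5 file with one telescope group.
# Names ("tel_001", "images") are illustrative only.
with tables.open_file("demo.h5", mode="w") as h5:
    tel = h5.create_group("/", "tel_001")
    h5.create_array(tel, "images", [[0.0, 1.0], [2.0, 3.0]])

# Recurse through every group and collect the arrays we find.
found = []
with tables.open_file("demo.h5", mode="r") as h5:
    for group in h5.walk_groups("/"):  # yields root, then each subgroup
        for arr in h5.list_nodes(group, classname="Array"):
            found.append(arr._v_pathname)

print(found)  # ['/tel_001/images']
```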

- My understanding of the attention mechanism: "QKV -> if you have a query of “fluffy white cat”, the algorithm will look through all its keys and be like, oh, this video’s title also has “cat”, and output a relatively high score because what you’re looking for and what this video seems to have are similar. Now this Q·Kᵀ similarity gives you a mapping, and when you multiply this mapping by the actual values associated with those keys (so the underlying video associated with that title), the output you get back is a weighted combination of all those values/videos, weighted by how relevant each is to your search based on the key words you searched for. And that’s basically what attention is: it tells you how similar what you want and what it has are, and then serves you the actual data associated with its keys in such a way that you know how relevant each piece of information actually is. It is a way to survey how similar things are to what you want or what you are, and to combine/provide the information from the values in some weighted way."
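That intuition maps onto scaled dot-product attention, which can be sketched in a few lines of PyTorch (the helper name and shapes are illustrative):

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(Q, K, V):
    """Score each query against every key, softmax the scores into
    relevance weights, then return a weight-averaged sum of the values."""
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5  # query-vs-key similarity
    weights = F.softmax(scores, dim=-1)            # each row sums to 1
    return weights @ V                             # weighted combination of values

# Toy example: 2 queries, 3 key/value pairs, embedding dim 4.
Q = torch.randn(2, 4)
K = torch.randn(3, 4)
V = torch.randn(3, 4)
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # torch.Size([2, 4])
```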

June 15, 2023: Worked in the machine shop cataloging for most of the day. Worked on transforming the hdf5 formatting into a usable format. Utilized dl1_data_handler (git clone, cd, git checkout v0.10.11, pip install -e .). Understanding parts of Ari's code. Able to plot the data for the LSTs only for now; need to be able to do it for the small ones too (issue bc of non-sequential naming order).
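The dl1-data-handler setup steps from the entry, written out as commands (the GitHub URL is my assumption of the repo location; the tag is the one quoted above):

```shell
# Editable install of dl1-data-handler pinned at the v0.10.11 release
git clone https://github.com/cta-observatory/dl1-data-handler.git
cd dl1-data-handler
git checkout v0.10.11
pip install -e .   # editable: local edits take effect without reinstalling
```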

June 16, 2023: Helped Ceci make computer do stuff. Missed Ana as she recuperated from illness. Looked over the code for generating the fakeIACT dataset and determined what would be different between using the fake data and the simulated data. Made a CTADataset class. Learned how to use a dongle and connect to a desktop.
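A PyTorch `Dataset` along the lines of the CTADataset class might look like this minimal sketch — the field names, shapes, and event counts are invented, not the project's actual class:

```python
import torch
from torch.utils.data import Dataset, DataLoader

class CTADataset(Dataset):
    """Sketch of a dataset wrapping pre-loaded telescope images and
    gamma/proton labels (hypothetical layout)."""
    def __init__(self, images, labels):
        self.images = images  # (N, num_tels, H, W) tensor
        self.labels = labels  # (N,) tensor: 0 = proton, 1 = gamma

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.images[idx], self.labels[idx]

# 8 fake events, 4 telescopes each, 48x48 camera images.
images = torch.randn(8, 4, 48, 48)
labels = torch.randint(0, 2, (8,))
loader = DataLoader(CTADataset(images, labels), batch_size=4)
xb, yb = next(iter(loader))
print(xb.shape)  # torch.Size([4, 4, 48, 48])
```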

June 20, 2023: Coding coding. We were able to make a dataset class to load the simulated data and got the stereo reconstruction code to work with it. Link for runs:

June 21, 2023: Group + Ari meeting. Ran efficiency analysis using num_workers parameter and found that 3 workers seemed to be the most efficient given the overhead and made the runs much faster. Added in seeding for reproducibility.
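The seeding-for-reproducibility step can be sketched like this (the helper name and seed value are arbitrary; `num_workers=3` would then be passed to the `DataLoader` separately):

```python
import random
import numpy as np
import torch

def seed_everything(seed: int = 42):
    """Pin every RNG the training loop touches so runs are repeatable."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op when no GPU is present

# Same seed -> same random tensors on repeated runs.
seed_everything(42)
a = torch.randn(3)
seed_everything(42)
b = torch.randn(3)
print(torch.equal(a, b))  # True
```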

June 22, 2023: Catalogued machines. Worked more on loading CTA data into stereo recon model.

June 23, 2023: Found all the event counts in each file for LST vs MST and properly split the data into an 80%/20% split for training and testing. Goal: eventually try to combine the LST and MST data somehow. Reshmi + Colin talks.
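An 80%/20% train/test split with PyTorch's `random_split` might look like this sketch (the event count and seed are made up):

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Fake event table: 100 events with 4 features each plus a binary label.
data = TensorDataset(torch.randn(100, 4), torch.randint(0, 2, (100,)))

n_train = int(0.8 * len(data))  # 80 events for training
train_set, test_set = random_split(
    data, [n_train, len(data) - n_train],
    generator=torch.Generator().manual_seed(0),  # seeded -> reproducible split
)
print(len(train_set), len(test_set))  # 80 20
```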

June 26, 2023: Figured out how to plot loss and accuracy vs time for both the training and validation datasets, so we can now use that to visualize how our model is doing. Tried to adapt the notebook script to run in the terminal (python 2.7) on shevek so it can run in a screen or something. ML call.
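One common way to turn a notebook into a terminal-runnable script and keep it alive inside a screen session (the notebook and session names here are hypothetical):

```shell
# Convert the notebook to a plain .py script
jupyter nbconvert --to script stereo_recon.ipynb   # writes stereo_recon.py

# Start a detachable session and run the script inside it
screen -S training
python stereo_recon.py > output.txt 2>&1
# Detach with Ctrl-A then D; reattach later with:
#   screen -r training
```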

June 27, 2023: Able to plot accuracy and loss vs time for just proton and gamma to determine whether or not the model is more equipped to handle one type of event better than the other. Did indeed find that it was better for gamma for the most part. Also learned to make ROC-AUC curves and how they worked. ALSO FINALLY MADE OUR IPYNB INTO A .PY SCRIPT WE COULD RUN IN A SCREEN IN TERMINAL SO WE DIDN'T HAVE TO WATCH OUR COMPUTER RUN FOR 2+ HOURS WITH LESS REST THAN ME DURING FINALS SEASON.
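The ROC-AUC machinery can be sketched with scikit-learn; the labels and scores below are toy values, using gamma = 1 and proton = 0:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Toy classifier scores: gammas (1) happen to score above all protons (0),
# so this toy example separates perfectly and the AUC is 1.0.
y_true  = np.array([0, 0, 0, 1, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.65, 0.9])

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # points on the ROC curve
auc = roc_auc_score(y_true, y_score)               # area under that curve
print(round(auc, 2))  # 1.0
```

Plotting `fpr` against `tpr` gives the ROC curve itself; AUC summarizes it in one number, with 0.5 meaning chance and 1.0 meaning perfect gamma/proton separation.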

June 28, 2023: Ari + Group meeting. Mainly just doing runs and fixing any errors as prepped the code to be able to perform an ablation study and compare our model to other baselines. (Note: ran from 10pm at night to 10am the next day....?).

June 29, 2023: Tried to replace the CNN with a normal linear layer. Had issues due to dimensions. Ran an MST run without positional encoding; got to the last epoch after more than 12 hours of running, but it failed due to CUDA memory issues? MUST FIX.
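The dimension issue when swapping the CNN for a linear layer comes from `nn.Linear` expecting flat vectors rather than (batch, channels, H, W) images; one fix is flattening first. The shapes below are invented:

```python
import torch
import torch.nn as nn

batch, H, W = 4, 48, 48
images = torch.randn(batch, 1, H, W)  # (batch, channels, H, W) like a CNN input

linear_frontend = nn.Sequential(
    nn.Flatten(),            # (batch, 1*48*48) — collapses channel + spatial dims
    nn.Linear(H * W, 128),   # maps each flattened image to a 128-d embedding
    nn.ReLU(),
)
out = linear_frontend(images)
print(out.shape)  # torch.Size([4, 128])
```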

June 30, 2023: Was able to make a version of code (v6) without CNN that didn't immediately run into issues in StereoClassifier class. Haven't tested yet because shevek is preoccupied.

- v3 = multiple models
- v4 = no positional encodings
- v5 = testing different numbers of attention layers
- v6 = no CNN
- v7 = learning rate
- v8 = energy dependence
- v9 = images for high and low loss for attention and LSTM
- v10 = layers and learning rate combined
- v11 = amount-of-data analysis

July 5, 2023: Over weekend, ran a version of v6 for LST and MST and got ROC graphs for both. Also ran some previous code ablation versions that stopped working in the middle to get their graphs for MST. Have all but v4's MST ROC graph. ARI HAD A BABYYYYY! output_v7_trainVvalid.txt (just to get a plot of how LR affected loss)

July 6, 2023: Wrote up code to make ROC curves showing which learning rate was best optimized for our code. Started looking at plotting our model's accuracy based on the particle energy. output_v7.txt

July 7, 2023: Ran code overnight to get ROCs to find the optimal learning rate. 8e-4 seems to give slightly better results, but it's not a major difference. Might consider changing to 8e-4 starting at v8 (energy). Ran a run for energy. output_v8.txt

July 10, 2023: Finished a run for MST v7, which took about 3 days. It showed that in MST, attention does slightly better than LSTM and that 8e-4 gives better learning results. Analysis of v8 shows that the classification is better at higher energies.

July 11, 2023: Ran a v8 MST and renamed the previous run to _largerange. The new version exists because CTA spans a large energy range in general, so I had included it all -> cluttered, but LST and MST specifically have a very small range of sensitivity. Rewrote the code for v6, v7, and v8 and tested it so that it prints out the loss and accuracy of the test set and makes a separate ROC for the validation and test data.

July 12, 2023: Ran a v7 LST with the new printouts. Made a v9 code that prints the 10 events with the highest loss (5 gamma and 5 proton), with the goal of finding patterns in the events the model mislabels most badly.
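Picking out the highest-loss events, as v9 does, can be sketched with per-event losses and `torch.topk` (the event count, class count, and k are illustrative):

```python
import torch
import torch.nn.functional as F

# reduction="none" keeps one loss per event instead of averaging,
# so events can be ranked by how wrong the model was about them.
logits = torch.randn(20, 2)             # 20 fake events, 2 classes
labels = torch.randint(0, 2, (20,))
losses = F.cross_entropy(logits, labels, reduction="none")

worst_vals, worst_idx = torch.topk(losses, k=5)  # the 5 hardest events
print(worst_idx.shape)  # torch.Size([5])
```

`worst_idx` then indexes back into the dataset to pull the corresponding images for inspection; `topk` returns the values sorted largest-first.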

July 13, 2023: Waited for the v7 LST... taking long... Prepared for journal club and the SRI poster. Ran a v8 LST. Outputs are called ROCv8_LST_Test.png and ROCv8_LST_Validation.png, along with a v8.txt with the outputs and a cta_v8_LST_finaloutputs.txt in case v8.txt failed.

July 14, 2023: Gave a talk during journal club. Ran a v3 (multiple models) with the same naming notation as above, after editing the code to print the test statistics and the ROCs of the test and validation sets. Started doing the same for the other versions in preparation for running all versions for LST over the weekend.

July 17, 2023: Started filling out and formatting the big spreadsheet of all final results so far; ran more runs using a script.

July 18, 2023: More runs (MST), so slowly filling out the big spreadsheet. Need to rerun v9 MST due to the weird location of a caption. Started working on the SRI poster instead.

July 19, 2023: Running v3 MST and v6_ana MST (multiplicity) in the background. Trying to fix up the v9 code to properly extract the loss values after a call with Ari. New goal: run a version of the code that checks different learning rates with different layer counts to find where the values get optimized. Symposium talk.

July 20, 2023: Machine workshop. Made a working v9 code.

July 21, 2023: Made v10 and v11 code and am working out the kinks. v10 will make a graph per number of attention layers for the learning-rate ROCs. v11 will be a run that changes the amount of data used in the data loaders. Going to run LST for v9, v10, and v11 over the weekend after checking that they all run without issues on a smaller subset of data.

July 24, 2023: Got results of v9. Barbeque.

July 25, 2023: Got results of v10 run. Realized something wrong with v11 code and started to try to debug. Prepared slides for symposium.

July 26, 2023: Symposium. Fixing v11 and working on poster for SRI poster session.

July 27, 2023: Finished machine workshop cataloging. Redid most of the poster formatting. Finally fixed up v11 and started running it (LST).

July 28, 2023: Alumni panel meeting. Ran a v11 MST.

July 31, 2023: Waiting on v11 MST. Tried to figure out whether the images from the failed v9 gamma run matched actual gamma events in the files (they look like they have muon rings, and I'm not sure if this is a code issue or a simulation issue).

Aug 1, 2023: Call with Ari about new future steps. Probably to look at an MLP next, plus optimizations.


Topic revision: r43 - 2023-08-01 - ClaudiaTang