Six months of PhD: a retrospective

In this post, I am going to reflect on my past six months as a PhD student: what went right, what went wrong, what was good, and what was bad.

Overview

For a detailed description of the topic of my PhD, see this post I wrote some time ago. Briefly, I study new ways to develop personalized vaccines, targeted to specific patients and diseases. For this semester, I am also in charge of the exercises for a course on Deep Learning offered to Master's students of the statistics program of my university. I am employed at LMU and am a guest scientist at the Institute for Computational Biology (ICB). I have two supervisors coming from those institutions, who help me with the methods (machine learning and discrete optimization) and subject matter (biology and immunology). This is the scheme devised by my graduate school, MUDS.

Summary

I will actually start with the conclusion, leaving the details for later. In general, I am really enjoying the experience: life as a PhD student aligns perfectly with my values of autonomy, competence, and reasonableness. I am also lucky to have two very good supervisors who are knowledgeable and not bossy.

So far, the good things are:

Research diary: it really helps me stay focused and organized. I am glad I started it from the beginning. I am also happy with the tools I use, as they give me a lot of freedom to organize and re-organize things as I see fit.
Progress: I have already submitted a paper and am working on the next one. I also collaborated on a third project, which is going to be published soon, although I had a minor part in it. My supervisors are idea factories, so work is never lacking.
Teaching: I really enjoy the teaching experience, especially being in the classroom and interacting with the students. Creating the exercise sheets does not take that much time, but I got mixed feedback on them: on the one hand, they are well done and useful; on the other hand, they are too hard and too long. This latter point is mostly due to the way they are presented, and giving more hints should be enough.

Some things which are not so good:

Reading: it took me some time to get into systematic reading habits, mostly because, entering a new field, I had to look for good sources. I also was not very organized in how I learned about this new field, mostly reading bits and pieces whenever I needed them. However, I have improved on this and I am satisfied with the system I found.
Collaborations: I have been working alone on the two major projects I described above, and my part in the soon-to-be-published one took place during a five-day hackathon. I have always been a lone wolf, and I do not mind working alone, but I should really improve my collaboration and team work skills. Not to mention that I could learn tons of things from the people I collaborate with. I also wish to improve my non-existent leadership skills, which can only happen by being in the driving seat of a project.
Belonging to two research groups: this has positive and negative sides, and mentioning it here does not mean that I am dissatisfied or want to change it. The problem is that it requires more interactions with more people. Being an introvert by nature, I find that a lot of social interaction drains my energy. It also means more meetings. The positive side is, of course, being exposed to a very diverse set of topics and people (statisticians and biologists), which I find very intellectually stimulating.

Things that caused me some difficulty:

Method evaluation: before starting this PhD, I thought that most of the effort went into thinking up and developing a new method, then some effort into performing experiments, and a little into writing and publishing the paper. Was I wrong! It turns out that having new ideas and improving on the state of the art is not a big deal, and the implementation is also more or less straightforward. Evaluating new methods is not very hard per se; what I find difficult is coming up with good experiments that show where a method is good, and when and why it is to be preferred over the alternatives. I never liked being a salesman: I thought people were rational and able to objectively evaluate and choose the better method, and that if an idea was good it would be self-evident, without need of further proof. Wrong! You literally have to smash it in their face. Put less emphasis on “these are all the cool things you can do with our new method”, and focus on “look at how much better you can do these things with our new method”. It makes a big difference. Finally, writing the paper can take as much time as everything else, even without factoring in the delay caused by peer review.
Procrastination: when you have so much independence, it is sometimes easy to keep postponing certain things you do not like, such as performing experiments just for the sake of proving how much better you are. Acting like this is, in general, not in my character, so I felt uncomfortable entering this mindset, which created some delays in the project. The recipe to fight procrastination is easy: set clear deadlines, keep your peers and supervisors informed, and, most importantly, hold yourself accountable. The last part can be as simple as giving yourself prizes when things get done in time, and denying yourself pleasures when things are not going too well.

Finally, the action points to tackle these issues are:

Spend more time reading. Simple. Once the Deep Learning course is over, I plan to devote that time to systematically reading books about the fundamentals of math, statistics and biology. This strategy was suggested by my supervisor, as opposed to attending courses at university. I am explicitly encouraged to spend time learning and improving my skills, and I really appreciate this.
Find interesting people to collaborate with. Good people are more important than good projects. The goal is to learn from them. This is also simple on paper: just meet a lot of people and talk with them about our interests. It is not something that comes naturally to me, but it is becoming easier.
Do another retrospective. Self-reflection is the only way to get better at things you struggle with.

Professional development

Before starting my PhD, I was keen on working on my soft skills, such as team work, leadership, and communication, while putting less emphasis on technical skills. I do not intend to stop learning new things, far from it, but now I have reached a point where acquiring new technical skills is just a matter of course: just like cooking dinner, it must be done, and it is more or less done in the same way every time.

In the past six months, I have been working a lot on communication: I started this blog (which, as far as I know, nobody reads), I teach at university (as a mere TA), and I share my research through presentations, posters and scientific articles (nothing has been published yet). Communication sure takes a lot of effort! The only feedback I received was about teaching, not only about the exercises but also about my presence in the classroom, which is, in general, good. I am helpful and cooperative, and I stimulate interest in the subject. There are mixed feelings about my ability to make difficult things understandable, which is something I value and should improve. I am looking forward to seeing the reviews of the article I submitted; I would be happy if there were no misunderstandings due to sloppy writing.

I have not achieved much in terms of collaborations, team work and leadership, the only relevant event being the hackathon in Dresden. It was definitely a positive experience, in the end, but I feel like my leadership was a bit lacking. I did have the respect of the others, but I made them waste some time on useless work. I do not consider myself good at dealing with stressful, high-pressure situations, possibly because I lack the quick intuition, coming from experience, of what the right thing to do is. However, I am not too worried about my lack of collaborations. After all, this is a completely new environment for me, and I did not know anybody when I started.

Research diary

I try to keep a research diary, logging what I do every day and how I spend my time, and taking notes. This really helps me stay focused and in control of my time and energy. I use Org mode for Emacs (well, actually, Spacemacs), which can be seen as an extended Markdown, very well integrated into Emacs. In particular, its facilities to clock working time make time tracking effortless. Briefly, below every heading there is a list that contains the dates and times of when I worked on that item. These intervals are maintained automatically by various commands in Emacs, so that I only need to clock in when I start working on something and clock out when I stop. For example, this is a snippet coming from my diary, where I removed unnecessary text:

* Teaching
** Deep Learning WS 19/20
*** Labs
**** Lab 7
     :LOGBOOK:
     CLOCK: [2019-12-20 Fri 09:09]--[2019-12-20 Fri 10:08] =>  0:59
     CLOCK: [2019-12-18 Wed 08:45]--[2019-12-18 Wed 09:46] =>  1:01
     CLOCK: [2019-12-17 Tue 13:02]--[2019-12-17 Tue 18:13] =>  5:11
     CLOCK: [2019-12-17 Tue 10:02]--[2019-12-17 Tue 11:43] =>  1:41
     :END:

Here, section headers are specified with several * (the equivalent of Markdown’s #), and the LOGBOOK section is maintained by Emacs with the commands I mentioned above.

Emacs can automatically produce time reports based on the clocking information. I assess my progress by looking at weekly, biweekly and monthly reports every Monday, as a brief, informal and personal stand-up before the stand-up meeting with the rest of the group. I also use the agenda to keep track of which items to work on, although my use of this functionality is still very basic.
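To give a feeling of what such a report boils down to, here is a minimal Python sketch that sums the clocked intervals of the Lab 7 snippet above. It is not part of my setup (Emacs does all of this on its own), and the names CLOCK_RE, total_clocked and LAB_7 are made up for this illustration. It assumes every CLOCK line is closed, i.e. has both a start and an end timestamp.

```python
import re
from datetime import datetime, timedelta

# Matches closed Org mode CLOCK lines such as
# CLOCK: [2019-12-20 Fri 09:09]--[2019-12-20 Fri 10:08] =>  0:59
CLOCK_RE = re.compile(
    r"CLOCK: \[(\d{4}-\d{2}-\d{2}) \w+ (\d{2}:\d{2})\]--"
    r"\[(\d{4}-\d{2}-\d{2}) \w+ (\d{2}:\d{2})\]"
)


def total_clocked(text: str) -> timedelta:
    """Sum all closed clock intervals found in a chunk of Org text."""
    total = timedelta()
    for day1, time1, day2, time2 in CLOCK_RE.findall(text):
        start = datetime.strptime(f"{day1} {time1}", "%Y-%m-%d %H:%M")
        end = datetime.strptime(f"{day2} {time2}", "%Y-%m-%d %H:%M")
        total += end - start
    return total


# The LOGBOOK drawer of the "Lab 7" item shown above.
LAB_7 = """
:LOGBOOK:
CLOCK: [2019-12-20 Fri 09:09]--[2019-12-20 Fri 10:08] =>  0:59
CLOCK: [2019-12-18 Wed 08:45]--[2019-12-18 Wed 09:46] =>  1:01
CLOCK: [2019-12-17 Tue 13:02]--[2019-12-17 Tue 18:13] =>  5:11
CLOCK: [2019-12-17 Tue 10:02]--[2019-12-17 Tue 11:43] =>  1:41
:END:
"""

minutes = int(total_clocked(LAB_7).total_seconds()) // 60
print(f"{minutes // 60}:{minutes % 60:02d}")  # prints 8:52
```

The intervals are recomputed from the timestamps rather than read from the part after =>, so the result does not depend on Emacs having kept those summaries up to date.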

The main sections in my research diary are:

Research: contains all the projects I work on, and every project contains information on the main work items to be done: implementation, evaluation, the paper, presentations, posters, and so on. Each of these can contain further sub-topics, as needed.
Teaching: for now, this only contains the Deep Learning course which I am TA’ing. I keep track of the time needed to make the exercises for the students, to attend classes, as well as other administrative matters such as updating the website and helping with the slides (which are mostly done by other PhD students).
Learning: in most cases, I do not keep track of what exactly I learn; I only register the time I spend on learning. The reason is that I prefer to directly annotate the PDF of the papers I read (if I decide it is worth reading more than the abstract). I take notes when I attend seminars and journal clubs, and when I watch lectures on YouTube.
Meetings: I have an item for every meeting, at least to clock the time. Often I take notes during the meeting itself, and for important meetings I also use this area to write my thoughts and outline a rough agenda before the meeting happens.
Events: this refers to conferences, retreats, and hackathons.
Administration: just to make sure I do not forget about paperwork.

Overall I am very satisfied with this diary, and I think my life would be considerably messier without it. Being text-based, unexpectedly, turns out to be a huge advantage: I can sort and re-order things however I please (I needed several iterations before finding a comfortable way of organizing things), and I am not limited by an arbitrary maximum depth for my items. Like many other things, it has a steeper learning curve compared to the mainstream dumbed-down solutions, but the benefits are well worth it.

Working time analysis

In this section I look at and comment on the time I spent on various activities. Below, I show a time report generated by Emacs, which I edited for clarity by removing irrelevant sections. I also added links to the relevant events or the organizations responsible, so that you know what I am talking about.

Headline	Time	%
Research	10d 9:49	40.0
  Generalized EV design	6d 23:06	26.7
    Paper	3d 2:52	12.0
    Compare our approach with others	2d 14:19	10.0
  Spacers	2d 16:16	10.3
  Moonshots	11:16	1.8
Events	7d 22:47	30.5
  MUDS Data Science Block Course	3d 1:16	11.7
  Dresden Deep Learning Hackathon 2019	2d 2:18	8.0
  CompStat Weekend 2019	1d 0:13	3.9
  ICANN 2019	20:00	3.2
  ICB Retreat 2019	20:00	3.2
Teaching	3d 13:49	13.7
  Material	2d 9:49	9.2
  Attendance	1d 0:49	4.0
  Administration	3:11	0.5
Meetings	2d 13:15	9.8
Learning	1d 3:05	4.3
Administration	10:28	1.7
Total time	26d 1:13	100
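A note on how to read the report: a "d" in the Time column stands for a full 24 hours of clocked time, not for a working day. As a quick sanity check, this small Python sketch recomputes the % column for the six top-level headlines. The function name org_minutes and the TOP_LEVEL dictionary are only introduced here for illustration and are not something Emacs provides.

```python
def org_minutes(duration: str) -> int:
    """Convert a duration such as '10d 9:49' or '3:11' to minutes (1d = 24h)."""
    days = 0
    if "d" in duration:
        day_part, duration = duration.split("d")
        days = int(day_part)
    hours, minutes = duration.strip().split(":")
    return (days * 24 + int(hours)) * 60 + int(minutes)


# Top-level headlines copied from the report above.
TOP_LEVEL = {
    "Research": "10d 9:49",
    "Events": "7d 22:47",
    "Teaching": "3d 13:49",
    "Meetings": "2d 13:15",
    "Learning": "1d 3:05",
    "Administration": "10:28",
}

total = org_minutes("26d 1:13")
shares = {
    name: round(100 * org_minutes(duration) / total, 1)
    for name, duration in TOP_LEVEL.items()
}

for name, share in shares.items():
    print(f"{name}: {share}%")

# The six top-level headlines add up exactly to the reported total.
print(sum(org_minutes(d) for d in TOP_LEVEL.values()) == total)  # prints True
```

The recomputed shares match the % column, which confirms the 24-hour reading of "d".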

In total, I clocked 625 hours and 13 minutes. The “canonical” working time would approximately be eight hours a day, five days a week, four weeks per month, for six months, i.e. 960 hours. This means that I accounted for 65% of the time I am supposed to be working, or an average of five hours per day instead of eight. This is consistent with my gut feeling, but the figure has to be taken with a grain of salt for a couple of reasons:

I did not start with this system straight away, and especially at the beginning I was not clocking everything I did (for example, the paperwork needed to enroll at the university, which is not as simple as it sounds). I started clocking consistently three or four months ago, and even today there are some periods that I do not count. One of the most important examples is the time I spend wondering what I should be working on.
I am not working in an assembly line, and staying really focused and productive for eight hours every day is simply not feasible. Although it does not produce immediate gains, taking some time off to recap the work done, recent learnings, and the path ahead is much more important than pretending to be busy. I am not afraid to call it a day when I am stuck a few hours after lunch. Time away from the keyboard is essential to take a step back, look at the big picture, and find another way forward.
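For the record, the 65% and five-hours-per-day figures above come from the following back-of-the-envelope computation, written here as a few lines of Python. The variable names are arbitrary, and the "canonical" time ignores holidays and vacation, so it is only a rough yardstick.

```python
# Clocked time from the report: 26d 1:13, i.e. 625 hours and 13 minutes.
clocked_hours = 26 * 24 + 1 + 13 / 60

# "Canonical" time: 8 hours a day, 5 days a week, 4 weeks a month, 6 months.
canonical_hours = 8 * 5 * 4 * 6

share = clocked_hours / canonical_hours
print(f"{share:.0%} of {canonical_hours} hours")  # prints 65% of 960 hours
print(f"{8 * share:.1f} hours per day")           # prints 5.2 hours per day
```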

Now, what have I achieved in these roughly 625 hours over six months?

I (almost) completed an entire project: the idea came from my supervisor, but I did the implementation and the evaluation, wrote half of the paper, etc. This is under the Generalized EV design headline, which I covered in another blog post. This paper is currently (beginning of January) being reviewed, and it is very likely that more work will be needed before it gets published, but still. Note that the headline “Compare our approach with others” includes a lot of changes and improvements performed after the main idea was implemented, simply because these comparisons brought up new and unexpected perspectives.
I am about one third of the way through another project (the Spacers), and this time I had the idea myself (I tend to discount the project above as I “simply” implemented it. Thank you, impostor syndrome). The core of the project is already implemented and tested; what is missing is an evaluation and comparison with other approaches in the literature, as well as writing the paper.
The Deep Learning hackathon in Dresden resulted in another, now mostly complete, project that was presented at the Conference on Computing in High Energy & Nuclear Physics (CHEP 2019). During the hackathon I helped a team of physicists to use a graph convolutional neural network to classify particle decay events as “interesting” or “not interesting” (for their purposes at least). We are working on a paper about this. They did most of the job, such as preparing the data and a more thorough evaluation; I only helped them with the Deep Learning part (it was a five-day event, after all).
I created seven exercise sheets for the Deep Learning course. Each sheet contains three exercises, a mix of practical and theoretical ones. This was my first serious experience using the R programming language; I think I could have done the exercises in half the time if I had been allowed to use Python. Many exercises were inspired by old material and by Bishop’s Neural Networks for Pattern Recognition, which, although dated, is solid on the fundamentals, and I also proposed exercises that I found invaluable while I was learning the topic.
Other unquantifiable gains, such as knowledge and professional network.

Some comments on my time allocation are due:

The time I spent in meetings feels like a lot more than 10%. The reason is that a badly scheduled meeting can disrupt a whole morning or afternoon. Just like every other technical worker, I dread meetings and I wish they could be kept to a minimum. I am also in a particular situation because I am part of two working groups, which means that I have more meetings than usual. Moving between different workplaces also takes considerable time; for example, every Thursday I would spend two and a half hours commuting, with one meeting in the late morning and one in the early afternoon. Thursday was never a productive day.
Almost one third of my time spent on events is way too much, although two thirds of this time was for mandatory events which could not be avoided. A lot of them took place in September and October, and indeed in those months I was feeling very unproductive (also because the Deep Learning course had just started). The MUDS course lasted two whole weeks, and was mostly a waste of time.
Only four percent of my time was spent on learning! This is embarrassingly low, but I needed several months to find good sources of relevant research. I have been reading systematically only in the past few months, going through three to five articles in 20 to 40 minutes every day. Also, I often read while commuting (Janeway’s Immunobiology), although this time is never clocked in. Once the Deep Learning course is over, I plan to use that time to seriously read more books, starting from Strang’s Linear Algebra and Learning from Data.
58 hours for seven exercise sheets makes slightly more than eight hours per sheet; adding two more hours for my physical presence in the classroom gives around ten hours per week. It is slightly more than what I am supposed to spend on it (i.e., eight hours), but the next labs will be quicker to do, as my students complained that the exercises so far are useful and stimulating, but too long and complicated (lazy youngsters…). Also note that I am attending regular lessons to align what is done in the labs with what is said in the lecture. I have not studied at this university nor in this country (and, consequently, I did not know how good a professor my supervisor is), so I think this time is well spent.

To sum up, although I do not always work the “canonical” eight hours per day, I made good progress in my research. The exercises I made for Deep Learning are high quality and well received, except for their length, both by my students and by my supervisor. I should spend more time learning, which will happen once that course is done. On paper, I do not spend much time in meetings, but in practice it feels like much more. I also had to participate in several events, some of them more useful than others.