Mistakes can be costly. I have made several kinds of mistakes in my research: some are not easy to identify, some take a serious amount of time to fix once identified, some evolve into literal nightmares that haunt me at night even after being fixed, and some, perhaps, are still hiding somewhere, yet to be exposed. Here I summarize the mistakes from my past six years of research into six types, to remind myself not to repeat them. Hopefully it will also offer something useful to readers.
Type A: logic flaws
Here I treat observations as analytical results or outputs from coded scripts, so I discuss coding and observation together. The most dangerous ascertainment bias in coding is when the observations agree with expectations, but do so only because of a bug in the code. More often, when the observations look weird, the code gets revisited until things look plausible; when they look as expected, it does not. This is super dangerous. It can be even more dangerous for small tasks that only need to be done once or twice than for bigger tasks that are applied to data repeatedly. I find it helpful to do some testing even when everything runs smoothly the first time. Another thing I try to do is avoid having any prediction or expectation, at least when I first analyze something. I also try to slow down and stay focused when coding even very simple things, and to code only when I am in the mood to code; I find this helpful.
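As a minimal sketch of the kind of spot-check I mean (the helper function here is hypothetical, not from any project mentioned in this post): even when a small script runs without error, checking it on a few inputs whose answers are known in advance catches the bug that happens to produce an expected-looking result.

```python
def gc_content(seq):
    """Fraction of G/C bases in a DNA sequence (hypothetical example helper)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

# Even when the script "runs fine" on real data, spot-check a few
# cases where the correct answer is obvious:
assert gc_content("GGCC") == 1.0          # all G/C
assert gc_content("atat") == 0.0          # all A/T, lowercase handled
assert abs(gc_content("ATGC") - 0.5) < 1e-9  # exactly half
```

The checks take a minute to write, and they fail loudly if, say, the lowercase handling or the denominator is wrong.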
Observations can be more general than script outputs. They can be experimental results, with which I have little familiarity. Ascertainment bias in observation can also arise when acquiring knowledge from the literature: if there is a debate in the literature and a person reads more about one side than the counterargument, that person may well miss valuable evidence on the other side. This resembles the bias in learning and understanding in Type A, but it stems not from logic but from biased exposure to the literature. The process can be entirely unintentional, or subconscious. Another scenario resembles Type F: failing to notice existing literature on the topic of study can turn out to be quite a disaster.
Type D: missed important information from data
This is, so far, my most painful kind of mistake, and fortunately it has occurred only once. It is scary not only because all the analyses need to be redone, but also because the previous observations, and all interpretations built on top of them, may no longer hold. Fortunately, in my case, the main result stayed the same. There are at least two kinds of important information that can be missed, especially when using public data.
The worst thing about pride is that it leads me to overlook my own mistakes. I struggle to suppress my inner ego, because otherwise I would assess my own questions and methods ignorantly. I learnt this the hard way. Once, I brushed off the first person's question about whether my test was unbiased, did not try to prove it with a thorough simulation, and only found that it was indeed biased when another person raised the same concern. This experience made me realize that I have to constantly remind myself to question my own judgement before other people question it, and that I should definitely question it when someone else does. If something can be proven by a simulation or a mathematical proof, it may well be worth the time.
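A simulation of the kind alluded to above can be very short. The sketch below is hypothetical (the post does not say what the biased test was); it uses a deliberately biased variance estimator as a stand-in: simulate many datasets from a distribution with a known parameter, average the estimates, and compare against the truth.

```python
import random
import statistics

def naive_var(xs):
    # Stand-in for the estimator under scrutiny: divides by n,
    # which is known to underestimate the sample variance.
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def mean_estimate(estimator, n=5, true_var=1.0, trials=20000, seed=0):
    # Draw many samples with known variance and average the estimates;
    # a clear gap from true_var indicates bias.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, true_var ** 0.5) for _ in range(n)]
        total += estimator(xs)
    return total / trials

# For n=5 the naive estimator has expectation (n-1)/n * true_var = 0.8,
# so its average lands near 0.8; the corrected (n-1) estimator lands near 1.0.
print(mean_estimate(naive_var))
print(mean_estimate(statistics.variance))
```

Twenty thousand trials run in well under a second and settle the question far more convincingly than intuition.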
Type F: reinventing the wheel
Reinventing the wheel can waste a lot of time, and it can also hurt the novelty of a project. It first happened to me in the first project of my Ph.D., a method for detecting selection on overlapping genes. I did not discover that the wheel had already been invented until I was revising the first draft of the manuscript. Although the two methods are quite different, and although I eventually managed to publish mine as well, I had to spend a lot of extra time justifying the new method: scanning for overlapping genes in the human genome, finding examples, comparing the speed and accuracy of the two methods, and so on. This type of mistake can be largely avoided by a more careful investigation of the literature. Nowadays there are so many journals and so many papers that it gets harder and harder to keep up. I adopt the following tactics to partially avoid this mistake: figure out all possible alternative terminologies and relevant concepts, go through the reference lists of all key relevant papers, go through all papers that cited those papers, go through the work of relevant researchers, and, if possible, ask someone more senior and knowledgeable to assess the topic.
Most people would not make as many mistakes as I do, or at least not as many types. Mistakes are often inevitable, but they can be fixed at an early stage to avoid future damage. Learning from the past, I have found triple-checking helpful, along with patience and staying calm when an exciting result appears.
Mistakes are not terrible; what is terrible is leaving a mistake uncorrected. It occurs to me that the most important thing is perhaps the mindset of knowing that mistakes are made unintentionally; therefore, intentionally and constantly triple-checking every possible step and correcting mistakes at an early stage can be more efficient than rushing for a quick result.
April's blog