I thought it might be funny and useful (as we all learn from mistakes) to share our biggest (or fairly large) mistakes in coding. I have made (too) many in my 40-year coding career, but I will start with the two most recent ones (from this week):
Programmed a very long-running routine, went for a walk as I knew it would run for at least one hour, and noticed upon return that I had forgotten to delete the “LIMIT 100” in my SQL statement before putting it into production. (Lesson learned: always(!) use pragmas with a check for DebugBuild.)
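A minimal sketch of that lesson, written in Python purely for illustration. The `jobs` table, its columns, and the `debug_build` flag (standing in for the DebugBuild check mentioned above) are all assumptions, not the original code:

```python
def build_query(debug_build: bool, sample_limit: int = 100) -> str:
    """Return the SELECT statement, capped only when running a debug build."""
    # Hypothetical table and columns, used only for illustration.
    query = "SELECT id, payload FROM jobs ORDER BY id"
    if debug_build:
        # The cap is tied to the build flag, so there is no line to
        # remember to delete before shipping.
        query += f" LIMIT {sample_limit}"
    return query


if __name__ == "__main__":
    print(build_query(debug_build=True))
    print(build_query(debug_build=False))
```

The point of the design is that the row cap lives behind the flag instead of inside the SQL text itself.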
Optimized another long-running program to use workers (for the first time). Cool stuff, and I succeeded. While I was uploading the app to my customer’s Linux server, it came to my mind to check how many cores it has. Guess the answer: ONE! (Lesson learned: life is not fair.)
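Checking the core count before investing in workers could look roughly like this in Python. The function names and the default of 8 requested workers are hypothetical; `os.sched_getaffinity` is only available on some platforms (Linux among them), hence the fallback:

```python
import os


def usable_cores() -> int:
    """Number of cores this process may actually run on."""
    if hasattr(os, "sched_getaffinity"):
        # Respects CPU affinity limits, which os.cpu_count() ignores.
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def choose_worker_count(cores: int, requested: int = 8) -> int:
    """Never start more workers than there are cores, and always at least one."""
    return max(1, min(requested, cores))


if __name__ == "__main__":
    cores = usable_cores()
    print(f"{cores} usable core(s), starting {choose_worker_count(cores)} worker(s)")
```

On a single-core server this reports one worker, which is a cheap way to find out that the parallel version will not help before deploying it.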
OMG! That’s a nightmare! It reminds me of a mistake I once made. I had backed up everything (well, I thought I had). The code was on GitHub, the code was zipped several times. I planned to have a nap, but before falling asleep it came to my mind that DELETE FROM XY is bad, as Postgres has the TRUNCATE statement. So as a last action I wanted to change that. It worked perfectly fine.
Only issue: I changed the table name too. Don’t ask me why … So I didn’t delete the content of my “Settings” table but that of the “main” table. Did I have a dump? Oh wait, no! No problem, the whole server is backed up on a daily basis. Oh wait, not this one … I was lucky that it happened at the beginning of my project. I usually start with an SQLite database before I move to a Postgres DB, and I had an SQLite file covering 99.5 percent of the data … (Lesson learned: if you are tired and you have a brilliant idea, write it down, or write yourself a short email etc., but NEVER start coding again when you are exhausted.)
Not a coding error, but a programmer error for sure. I wanted to remove some .log files and wrote “rm * .log”. Yep, that little extra space in there screwed the pooch. I was logged in as admin. At root. On the client’s machine. Spent the night frantically undoing the damage.
I topped that one in the late Nineties. I even added a “-rf” … also as the root user and IN the root directory … impressive that the system kept running; well, the booting was less successful. (Lessons learned: don’t use the root user EVER! Take spaces much more seriously on Linux … and NEVER copy tips from the Internet into the terminal without having understood them!)
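For what it’s worth, the stray space in “rm * .log” is harmless if the pattern is passed as a single string instead of being split by the shell. A rough Python sketch under that assumption; `remove_logs` and its `dry_run` default are hypothetical, not a standard tool:

```python
from pathlib import Path


def remove_logs(directory: str, dry_run: bool = True) -> list[str]:
    """Delete only regular files ending in .log directly inside `directory`."""
    target = Path(directory).resolve()
    if target == Path(target.anchor):
        # Guard against the "as root, in the root" scenario.
        raise ValueError("refusing to operate on the filesystem root")
    # The pattern is one string, so an accidental space cannot turn it into "*".
    matches = sorted(p for p in target.glob("*.log") if p.is_file())
    for path in matches:
        if not dry_run:
            path.unlink()
    # With dry_run=True this is a preview of what would be removed.
    return [p.name for p in matches]
```

Calling it once with the default `dry_run=True` shows the list of victims before anything is actually deleted.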
Not really a coding error, but I wiped a soon-to-be-production server. This was a server that was spun up just to do some experimenting on, so I didn’t set up the normal battery of failsafes like daily snapshots. I had code on it that was all proof of concept, so nothing was checked into version control. It was made clear to me at the start that this was all just for a proof of concept, nothing that was intended to be kept.
Well then I’m told, yep, we’re going to keep this server, something I really didn’t want to do. Had I known, I would have coded differently, besides just setting up all the backups and version control and all that. So I start going to add the missing bits. One of the simple tasks is to change the hostname, since it was clearly marked as a dev server. This takes seconds to do on the command line.
Except I was in Vultr’s control panel changing the reverse DNS there. So I might as well just change the hostname while I’m there, right? It has a place for that. I do that, and it gives me a warning that I didn’t fully read because of course I want to change the hostname. Yeah… turns out that wipes the server. Logically, it makes sense: Vultr can’t access the server to change it, so their only option is to do that at setup time.
Read the warnings.
In the end it could have been worse. I only nuked about a week’s worth of work, and replacing it only took about two days since the core problems had already been solved. It came back stronger as a result, but this was definitely the wrong way to go about it.
Not really a programming error, but on the order of “rm -rf” as root.
It’s 1980 and I’m employed as a student worker by the CS department at my university. I am responsible for the care and feeding of one Data General Nova and a DEC 11/23. The Nova had a disk drive that had a fixed platter and a removable platter. We had the OS and user files on the fixed drive and the removable pack was used for backups, temp files, and that sort of thing.
We got a new pack in for the drive and it needed to be formatted before use. I don’t recall the exact syntax of the command, but I recall it took the starting and ending surface numbers as arguments instead of however the OS distinguished between the two. You probably see where this is going.
Turns out that the fixed platter came first, surfaces 0-1, and the removable was 2-3. Of course, I assumed it was the other way (after all, the removable was on top, so shouldn’t it come first?). OS trashed, lots of user files destroyed including about a week of hard work by one of the professors on his pet database project.
Once again it is proven that the heart symbol as the only way to give feedback is not always appropriate in this forum. Well, take it as a sign of my empathy for this tragedy.
Talking about student days: as a medical student I worked in the IT department of the university hospital. My boss was a physician. My fellow student and I sent him messages on a Windows NT 3.51 network via:
net send HISUSERNAME "Ready for a coffee break?"
We had a lot of fun sending him such silly popups. One day he asked how we were achieving this “miracle”. So we explained it to him. Funny side note: IT departments in hospitals were only just being established in Germany at the time, and he was also the data protection officer. Everyone knew who he was and that he was the one responsible for typewriters being replaced by these strange computers.
To cut a long story short: several weeks later my colleague and I were shocked to see a popup on our screens, accompanied by “beeps” from all 6 PCs in our office. Obviously this message had been sent to the whole network of the university hospital. The message said:
WORK HARDER! BIG BROTHER IS WATCHING YOU!
So my boss was smart enough to remember the net send command and the right spelling. But he didn’t know my colleague’s or my Windows username. Being a smart guy, he thought: oh well, let’s try the * asterisk. This will save me time, as I want to send the same message to both gentlemen anyway :-).
He did speak to us again, but only after a few weeks of silence.
My own horror story with the “rm” command… working on a Sun Sparcstation way back in the day, I accidentally did “rm *” in the wrong directory, and, of course, I was in the shared library directory and the whole thing came down immediately. Sparcs used dynamic libraries extensively, so not even a simple “ls” command would work. Fortunately, the tar command was not dynamically linked and I was able to restore from a level-zero dump that I had done just two days before, so minimal damage was done. Lesson: always check what directory you are in before using “rm”!
So, in a minor update, the Valentina folks thought it a good idea to use UTF16 as the default encoding when the encoding of a string was nil. This turned the raw data of my app into Chinese characters. I noticed this accidentally between Christmas and New Year. It is the only time I have had cold flashes after noticing a problem. But because of the timing only a few users had installed the update. I only had to fix the data for a couple of users.
A couple of years ago I had something like < instead of <= checking a timestamp for SMS alert messages. It was supposed to filter messages that were already sent.
I was lucky it was my test system that routed all messages to my phone instead of the production system. Around 2000 messages (and $20) later I could finally use my phone again.
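The exact condition is not given above, so the following Python sketch only guesses at its shape: a hypothetical `pending_alerts` filter that compares each alert’s `created_at` timestamp with a `last_sent_at` watermark. The field and function names are assumptions:

```python
from datetime import datetime


def pending_alerts(alerts, last_sent_at):
    """Return the alerts that still have to go out."""
    # Anything stamped at or before the watermark counts as already sent.
    # With `<` instead of `<=`, the alert stamped exactly at the watermark
    # would pass the filter again on every polling run.
    return [a for a in alerts if not (a["created_at"] <= last_sent_at)]


if __name__ == "__main__":
    alerts = [
        {"id": 1, "created_at": datetime(2020, 1, 1, 12, 0)},
        {"id": 2, "created_at": datetime(2020, 1, 1, 12, 5)},
        {"id": 3, "created_at": datetime(2020, 1, 1, 12, 10)},
    ]
    watermark = datetime(2020, 1, 1, 12, 5)
    print([a["id"] for a in pending_alerts(alerts, watermark)])
```

A boundary test with a timestamp equal to the watermark is the cheapest way to catch this kind of off-by-one before it reaches a phone.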
Lucky you! (Lessons learned: never put something into production and then go for a walk, on holiday, etc., and don’t do it at night (impossible, though, if your product is used in multiple time zones) if you don’t want a bad wake-up call …)