The Way Back Machine – Microsoft Word for Windows 1.1a
On March 25, 2014, Microsoft released the source code for Microsoft Word for Windows 1.1a. They said they released it “to help future generations of technologists better understand the roots of personal computing.”
I thought it would be interesting to run an automated code review on it with Checkmarx to see how it holds up from a security standpoint. The source consists mainly of C code (376,545 lines of code), along with code written in assembler. The assembler code was not scanned because Checkmarx does not support assembler, and I am not aware of an automated code scanner that does. What came out of the tool was interesting.
Checkmarx indicated that the risk in the code is:
The distribution of risk from Informational to High:
You have to remember that this code is from the 1980s. Many people did not have a concept of secure code and the development tools did not address security at all.
The top five vulnerabilities are as follows:
From the code that I looked at, most of the issues come from the use of unsafe functions. For example:
    if (!strcmp(szClass, "BEGDATA"))
        strcpy(szNameSeg, "Data");
    else
        strcpy(szNameSeg, szName);
    nSegCur = nSeg;
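To make the risk concrete, here is a minimal sketch of how the same branch could be written with a bounded copy. The helper names `copy_bounded` and `name_segment` and the explicit buffer-size parameter are assumptions for illustration. They are not from the Word source, and the excerpt does not show how `szNameSeg` is actually declared.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the Word source): copy src into dst,
 * which has room for dst_size bytes, and always NUL-terminate.
 * Returns 0 if the whole string fit, -1 if it had to be truncated. */
static int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    /* Passing a missing or zero-sized buffer is a programming error. */
    assert(dst != NULL && src != NULL && dst_size > 0);

    len = strlen(src);
    if (len >= dst_size) {
        /* Keep what fits and terminate explicitly. */
        memcpy(dst, src, dst_size - 1);
        dst[dst_size - 1] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1); /* len + 1 copies the terminator too */
    return 0;
}

/* Hypothetical stand-in for the branch shown above. The size parameter
 * cbNameSeg is an assumption; the original passes no size at all. */
static int name_segment(char *szNameSeg, size_t cbNameSeg,
                        const char *szClass, const char *szName)
{
    if (strcmp(szClass, "BEGDATA") == 0)
        return copy_bounded(szNameSeg, cbNameSeg, "Data");
    return copy_bounded(szNameSeg, cbNameSeg, szName);
}
```

With an 8-byte buffer, `copy_bounded(buf, sizeof buf, "LONGSEGMENTNAME")` leaves `"LONGSEG"` in `buf` and returns -1, so the caller can decide whether truncation is acceptable instead of silently overrunning the buffer.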
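The function strcpy copies until it reaches the terminating null character and never checks the size of the destination, so a source string that is too long overflows the buffer. The usual first step is to use a bounded function such as strncpy, which takes the maximum number of bytes to write. Note that strncpy is not a complete fix: it does not null-terminate the destination when the source is too long, so the caller still has to terminate the buffer, and newer code often uses strlcpy, strcpy_s, or snprintf instead. The function strncpy was already part of the C library in the 1980s, but bounds-checked copying was not common practice at the time. The code also contains more than 120 goto statements. For example: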
    LError:
        cmdRet = cmdError;
        goto LRet;
    }
    pdod = PdodDoc(doc);
From the MSDN web site, Microsoft states, “It is good programming style to use the break, continue, and return statements instead of the goto statement whenever possible. However, because the break statement exits from only one level of a loop, you might have to use a goto statement to exit a deeply nested loop.” The break, continue, and return statements were already part of C in the 1980s, so they were available to the Word developers. In the fragment above, the goto jumps to a shared exit label (LRet), which is a common C idiom for routing every error path through a single return point.
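The fragment above jumps to a shared LRet label, and that single-exit pattern is still widely used in C for cleanup. A minimal sketch, assuming a hypothetical function `join_names` (not from the Word source) that allocates a buffer and sends every failure through one label:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example (not from the Word source): concatenate a and b
 * into a newly allocated string stored in *out.
 * Returns 0 on success, -1 on failure. Every path leaves through LRet,
 * so the cleanup code exists in exactly one place. */
static int join_names(const char *a, const char *b, char **out)
{
    int ret = -1;       /* assume failure until the work is done */
    char *buf = NULL;
    size_t la, lb;

    assert(out != NULL); /* a missing output pointer is a programming error */

    if (a == NULL || b == NULL)
        goto LRet;

    la = strlen(a);
    lb = strlen(b);
    if (la > SIZE_MAX - lb - 1) /* guard the size computation below */
        goto LRet;

    buf = malloc(la + lb + 1);
    if (buf == NULL)
        goto LRet;

    memcpy(buf, a, la);
    memcpy(buf + la, b, lb + 1); /* lb + 1 copies the terminator too */

    *out = buf;
    buf = NULL; /* ownership moved to the caller; nothing left to free */
    ret = 0;

LRet:
    free(buf); /* free(NULL) is a no-op, so this is safe on every path */
    return ret;
}
```

The same function could be written with early returns, but each one would need its own cleanup. Used this way, goto is a style choice rather than a security flaw, which is consistent with the scanner reporting it at the Information level.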
You can get a copy of the code for both MS Word and MS-DOS from here: https://www.computerhistory.org/press/ms-source-code.html. Just remember that there are now better ways to write code.
Below is the complete list of issues found in the code:
Vulnerability Type | Occurrences | Severity |
---|---|---|
Buffer Overflow unbounded | 180 | High |
Buffer Overflow StrcpyStrcat | 22 | High |
Format String Attack | 18 | High |
Buffer Overflow OutOfBound | 12 | High |
Buffer Overflow cpycat | 3 | High |
Use of Uninitialized Pointer | 135 | Medium |
Dangerous Functions | 58 | Medium |
Use of Uninitialized Variable | 41 | Medium |
Char Overflow | 35 | Medium |
Stored Format String Attack | 19 | Medium |
Stored Buffer Overflow cpycat | 11 | Medium |
MemoryFree on StackVariable | 4 | Medium |
Short Overflow | 2 | Medium |
Integer Overflow | 1 | Medium |
Memory Leak | 1 | Medium |
NULL Pointer Dereference | 341 | Low |
Potential Path Traversal | 24 | Low |
Unchecked Array Index | 18 | Low |
Unchecked Return Value | 11 | Low |
Potential Off by One Error in Loops | 6 | Low |
Use of Insufficiently Random Values | 3 | Low |
Potential Precision Problem | 2 | Low |
Size of Pointer Argument | 1 | Low |
Methods Without ReturnType | 500 | Information |
Unused Variable | 310 | Information |
GOTO Statement | 132 | Information |
Empty Methods | 9 | Information |
Potential Off by One Error in Loops | 6 | Information |
This code is a good example of what not to do.
Programming languages and tools have evolved to make your application much more secure, but only if you teach your developers the concepts of secure coding.