On March 25, 2014, Microsoft released the source code for Microsoft Word for Windows 1.1a. They said they released it “to help future generations of technologists better understand the roots of personal computing.”

I thought it would be interesting to run an automated code review on it with CheckMarx to see how it fares on security. The source consists mainly of C code (376,545 lines) plus some code written in assembler. The assembler code was not scanned because CheckMarx, like most automated code scanners, does not support assembler. What came out of the tool was interesting.

CheckMarx indicated that the risk in the code is:

[Image: CheckMarx overall risk score]

The distribution of risk from Informational to High:

[Image: distribution of findings by severity, Informational to High]

You have to remember that this code is from the 1980s. Many people did not have a concept of secure code and the development tools did not address security at all.

The top five vulnerabilities are as follows:

[Image: top five vulnerabilities]

From the code that I looked at, most of the issues come from the use of unsafe functions. For example:

	if (!strcmp(szClass, "BEGDATA"))
		strcpy(szNameSeg, "Data");
	else
		strcpy(szNameSeg, szName);
	nSegCur = nSeg;
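For comparison, here is a minimal sketch of a length-checked version of the snippet above. The helper names copy_bounded and name_segment and the cb size parameter are hypothetical and do not appear in the Word source. The sketch assumes the caller knows the size of the destination buffer.

```c
#include <string.h>

/* Hypothetical helper, not from the Word source: copies src into dst
   without writing past dstSize bytes. Always null-terminates.
   Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_bounded(char *dst, size_t dstSize, const char *src)
{
    if (dstSize == 0)
        return 0;
    /* strncpy does not terminate dst when src is too long,
       so terminate explicitly. */
    strncpy(dst, src, dstSize - 1);
    dst[dstSize - 1] = '\0';
    return strlen(src) < dstSize;
}

/* Same logic as the original snippet, with the buffer size (cb)
   passed in as an assumed extra parameter. */
int name_segment(char *szNameSeg, size_t cb,
                 const char *szClass, const char *szName)
{
    if (!strcmp(szClass, "BEGDATA"))
        return copy_bounded(szNameSeg, cb, "Data");
    return copy_bounded(szNameSeg, cb, szName);
}
```

The return value lets the caller decide whether a truncated name is acceptable or should be treated as an error. The original code has no way to detect that case.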

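The usual replacement for strcpy is a length-limited function such as strncpy. The function strncpy combats buffer overflow by requiring you to pass a maximum length, although it does not null-terminate the destination when the source is too long, so the caller has to do that. The function strncpy was already available in C libraries in the 1980s, but secure coding was not yet a common concern. The code also contains more than a hundred instances of the goto statement (the scan reports 132). For example: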

LError:
		cmdRet = cmdError;
		goto LRet;
		}
	pdod = PdodDoc(doc);

From the MSDN web site, Microsoft states, “It is good programming style to use the break, continue, and return statements instead of the goto statement whenever possible. However, because the break statement exits from only one level of a loop, you might have to use a goto statement to exit a deeply nested loop.” The break, continue, and return statements were all part of C in the 1980s, so the use of goto here was a choice. Judging by labels such as LError and LRet, it appears to be used mostly to funnel error handling to a single exit point.
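To illustrate the nested-loop case the MSDN quote describes, here is a small sketch with two variants of the same search. The grid, the ROWS and COLS constants, and both function names are made up for illustration and have nothing to do with the Word source. The first variant uses goto to leave both loops at once. The second gets the same effect with return.

```c
#define ROWS 3
#define COLS 4

/* Variant 1: goto exits both loops in one jump. */
int find_with_goto(int grid[ROWS][COLS], int target, int *outRow, int *outCol)
{
    int r, c;
    int found = 0;

    for (r = 0; r < ROWS; r++) {
        for (c = 0; c < COLS; c++) {
            if (grid[r][c] == target) {
                found = 1;
                goto LDone;   /* break would only leave the inner loop */
            }
        }
    }
LDone:
    if (found) {
        *outRow = r;
        *outCol = c;
    }
    return found;
}

/* Variant 2: the search lives in its own function, so return
   exits both loops and no label is needed. */
int find_with_return(int grid[ROWS][COLS], int target, int *outRow, int *outCol)
{
    int r, c;

    for (r = 0; r < ROWS; r++) {
        for (c = 0; c < COLS; c++) {
            if (grid[r][c] == target) {
                *outRow = r;
                *outCol = c;
                return 1;
            }
        }
    }
    return 0;
}
```

Both variants return 1 and fill in the row and column when the target is found, and return 0 otherwise. Moving the loops into a helper function is often enough to make the goto unnecessary.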

You can get a copy of the code for both MS Word and MS-DOS from here: https://www.computerhistory.org/press/ms-source-code.html. Just remember there are now better ways to write code.

Below is the complete list of issues found in the code:

| Vulnerability Type | Occurrences | Severity |
|---|---|---|
| Buffer Overflow unbounded | 180 | High |
| Buffer Overflow StrcpyStrcat | 22 | High |
| Format String Attack | 18 | High |
| Buffer Overflow OutOfBound | 12 | High |
| Buffer Overflow cpycat | 3 | High |
| Use of Uninitialized Pointer | 135 | Medium |
| Dangerous Functions | 58 | Medium |
| Use of Uninitialized Variable | 41 | Medium |
| Char Overflow | 35 | Medium |
| Stored Format String Attack | 19 | Medium |
| Stored Buffer Overflow cpycat | 11 | Medium |
| MemoryFree on StackVariable | 4 | Medium |
| Short Overflow | 2 | Medium |
| Integer Overflow | 1 | Medium |
| Memory Leak | 1 | Medium |
| NULL Pointer Dereference | 341 | Low |
| Potential Path Traversal | 24 | Low |
| Unchecked Array Index | 18 | Low |
| Unchecked Return Value | 11 | Low |
| Potential Off by One Error in Loops | 6 | Low |
| Use of Insufficiently Random Values | 3 | Low |
| Potential Precision Problem | 2 | Low |
| Size of Pointer Argument | 1 | Low |
| Methods Without ReturnType | 500 | Information |
| Unused Variable | 310 | Information |
| GOTO Statement | 132 | Information |
| Empty Methods | 9 | Information |
| Potential Off by One Error in Loops | 6 | Information |

This code is a good example of what not to do.

Programming languages and tools have evolved to make your application much more secure, but only if you teach your developers the concepts of secure coding.