M7 Script: Design Philosophy

Why Does M7 Use a Non-Standard Approach?

M7 was designed with a hands-on learning approach, where the goal was to retrace the footsteps of pioneers in the industry rather than rely on existing frameworks. This ensures a deep understanding of language construction rather than just using pre-built solutions. While some choices may seem unconventional, they were made to gain insight into the full process of language development.

By developing M7 from the ground up:

  • I wanted to prove to myself that I could write this without reading someone else's book or code, and I didn’t want to pollute my own experience and code by leaning on someone else’s work.
  • Every component, from parsing to execution, is understood and built without reliance on external frameworks.
  • The project serves as a learning experience while producing a functional and extensible scripting language.
  • Prototyping in a language other than C might be considered "cheating," since early compiler engineers didn't have that luxury. Then again, my first compiler was written in C++, and I suspect the original pioneers had generous university grants or other compensation for their suffering, something I definitely don't have! :)

While this approach may result in non-standard implementations, it is a deliberate choice to understand and refine every stage of language development.


Why Was M7 Not Developed Directly in C?

Although I have a strong background in C programming, I chose to write M7 in PHP for several reasons:

  1. Experimenting with a New Environment

    • While I wanted to build M7 from the ground up, this is not my first compiler.
    • However, it was my first time writing one in PHP, and I thought that would be a fun challenge.
  2. Focusing on the Algorithm Instead of the Language

    • Writing M7 in PHP allowed me to focus on developing the core algorithm rather than dealing with the low-level details of C.
    • A significant amount of time can be saved by avoiding manual memory management, type safety issues, and other C-specific concerns.
    • Ironically, some operations were actually harder in PHP than in C, which is an ideal language for writing compilers; working with references, in particular, was more awkward than working with pointers.
  3. Heap and Garbage Collection Experimentation

    • I wanted to work directly on heap management and garbage collection, but in an environment where I could focus entirely on the heap itself.
    • In C, I would have had to juggle both heap management and language intricacies at the same time.
    • PHP allowed me to isolate heap behavior without interference from low-level concerns like memory alignment or manual allocation.
  4. Rapid Debugging and Diagnostic Tools

    • I wanted powerful debugging tools readily available.
    • When developing a new system, you don’t always know what you’re looking for, and I didn’t want to waste weeks writing and rewriting debugging tools from scratch.
    • C would have made debugging significantly more difficult and slowed down early-stage development.

Footnote on PHP

PHP has a straightforward, no-nonsense syntax. Its function names are annoyingly long, and I disagree with some of its positional argument orders, but overall it is easy to work with. It is also obscenely fast for an interpreted language, making it ideal for prototyping.

Since I only used classes as a means of organizing libraries of static functions, much of the typical annoyance in PHP was avoided.

I also made a conscious effort to avoid my personal libraries, both for greater portability and because many of those ease-of-use tools won't be available to me in C.


Transition to C

  • The transition to C will commence once the final language constructs are completed, the codebase is properly organized for the interpreter, and the parser slowness is resolved.
  • I do not believe the slowness is PHP-related, but rather due to lazy design choices—the parser was completed in under a week and needs restructuring.

While M7 is currently written in PHP, it is designed to be rewritten in C once the core logic is fully refined. The interpreter is structured to allow direct translation into C without major rewrites: classes in the PHP implementation are used solely to represent structs, and the interpreter has no class methods at all, except for the UI wrappers. This keeps the design modular and lightweight, and it puts algorithmic correctness first, leaving the move to a lower-level language for when performance optimization becomes necessary.
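
As an illustration of the struct-only style described above, here is a minimal sketch. It is written in Python purely for brevity (M7 itself is PHP), and every name in it (Value, Frame, frame_push, frame_pop, op_add) is hypothetical rather than taken from the M7 source. The classes hold data only, and all behavior lives in plain functions that take the struct as an explicit argument, which is roughly the shape a C struct plus free functions would have.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical names throughout; none of this is from the M7 codebase.


@dataclass
class Value:
    """Data only, like a C struct: a type tag plus a payload."""
    kind: str          # e.g. "int" or "str"
    payload: object


@dataclass
class Frame:
    """A stack frame that holds values and nothing else."""
    stack: List[Value] = field(default_factory=list)


def frame_push(frame: Frame, value: Value) -> None:
    """Free function: the frame is passed in explicitly, as it would be in C."""
    frame.stack.append(value)


def frame_pop(frame: Frame) -> Value:
    """Remove and return the most recently pushed value."""
    return frame.stack.pop()


def op_add(frame: Frame) -> None:
    """Pop two int values and push their sum."""
    right = frame_pop(frame)
    left = frame_pop(frame)
    frame_push(frame, Value("int", left.payload + right.payload))
```

Because nothing here depends on methods or inheritance, each class maps onto a struct declaration and each function onto a C function with the struct pointer as its first parameter.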


C is for Losers. Why Not Assembly?

I considered it. But I also considered keeping my sanity.

Writing a compiler in assembly would have been the ultimate exercise in pain, suffering, and lost sleep. While it would be an interesting challenge, M7 is meant to be functional, not an experiment in self-torture.

That said, nothing is stopping someone truly dedicated (or completely unhinged) from writing an M7 implementation in pure assembly—but it won't be me!


Lessons Learned

  • I learned how to write a garbage collector and gained a direct understanding of how compilation works.
  • I don’t think I would have truly understood compilation at this level had I just read a tutorial.
  • Typically, when I rewrite a project from the ground up, I make better design decisions, but this time, some choices felt less efficient than those in my previous attempts.

Comparisons to Previous Compiler Projects

  1. Handling Function Returns

    • My earlier compilers, written in JavaScript and Perl, handled function returns more easily and more concisely than M7 does.
  2. Performance of Parsing

    • My previous compilers integrated tokenization and parsing, which made parsing significantly faster.
    • In those compilers, the tokenizer grouped parentheses, comments, blocks, and braces into a hierarchical structure, so each of these elements became a substream within the main token stream. The parser no longer had to trace expressions recursively, which simplified its workload while preserving structural integrity.
    • The M7 parser is highly general-purpose, which prevents the kind of shortcuts I would have normally taken when parsing structured text.
    • There’s a trade-off between general-purpose functionality and efficiency.
  3. First Compiler Experience (C++ Trading Platform)

    • The first compiler I wrote was for a trading platform in C++.
    • I didn’t want to continuously write and compile C++ code just to test logic.
    • The final output resembled assembly but with hashes and classes.
    • C++’s class-based design made execution highly efficient: each operation could be mapped to a class, and operator overloading routed execution to the right one.
  4. Ease of Writing in Different Languages

    • Perl was by far the easiest language to write a parser and interpreter in.
    • JavaScript was also pleasant, though slightly less so than Perl.
    • PHP was useful for parsing, but I found myself fighting the language in the interpreter.
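
The hierarchical tokenization mentioned under Performance of Parsing can be sketched roughly as follows. This is not the code from any of the compilers discussed here; it is a minimal illustration in Python (chosen only for brevity), it handles parentheses and braces but not comments, and the names tokenize and PAIRS are assumptions made for the example. Each bracketed group becomes a nested list, tagged with its opening bracket, inside the enclosing token stream.

```python
from typing import List, Union

# A token is either a plain string or a nested list of tokens (a substream).
Token = Union[str, list]

# Hypothetical table of opening brackets and their matching closers.
PAIRS = {"(": ")", "{": "}"}


def tokenize(source: str) -> List[Token]:
    """Split on whitespace and brackets, nesting bracketed groups as sublists."""
    root: List[Token] = []
    stack = [root]      # stack[-1] is the stream currently being filled
    expected = []       # closers still awaited, innermost last
    word = ""

    def flush() -> None:
        # Emit the word collected so far, if any, into the current stream.
        nonlocal word
        if word:
            stack[-1].append(word)
            word = ""

    for ch in source:
        if ch in PAIRS:
            flush()
            group = [ch]                 # substream tagged with its opener
            stack[-1].append(group)
            stack.append(group)
            expected.append(PAIRS[ch])
        elif ch in PAIRS.values():
            flush()
            if not expected or expected.pop() != ch:
                raise SyntaxError(f"unbalanced {ch!r}")
            stack.pop()
        elif ch.isspace():
            flush()
        else:
            word += ch
    flush()
    if expected:
        raise SyntaxError(f"missing {expected[-1]!r}")
    return root
```

With this shape, a later parse stage can treat a whole parenthesized expression or block as a single element and descend into it only when needed, instead of scanning for the matching bracket itself.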

What Would I Do Differently?

  • If I were starting over, I would consider prototyping in Perl: written carefully, it translates to C more directly than other high-level languages, although it demands a moderate amount of arcane knowledge of Perl's syntax.
  • However, I will not be rewriting M7 in Perl—it is going straight to C after this version.

Was AI Involved in the Development Process?

Yes, AI helped me write these documents—it does a far better job at it than I can.

AI was also consulted for proper terminology and general computing concepts, as I lack formal training in compilation.

The code itself was written by me, with occasional help for syntax-related questions or to 'beautify' comments (I have a foul mouth). AI also assisted in writing some very pretty debugging tools that saved me a lot of time.