2026-02-05 12:46:36 +02:00
parent 660b9245fb
commit 96f8aa3bbb
13 changed files with 387 additions and 17 deletions

174
docs/main.md Normal file

@@ -0,0 +1,174 @@
## Quick warning
This language is still in its pre-alpha stage. While it tends to work, it still has numerous bugs and breaks at the first whiff of unusual input, so it is not suitable for anything beyond experimenting and having fun. That said, this project has been my biggest one yet and is one of my favorites. It has gone through multiple rewrites over the 4 or so years and has changed a lot, from a simple Forth-style syntax to the current C/Rust style. That has been a struggle, but it is now at a point where it actually produces code and you can *kinda* make something with it.
Hope you have fun figuring out how to use this damn thing, and maybe even build something with it.
The docs were also written with ChatGPT (gross, I know), but I needed some docs fast.
\- MCorange
## Summary
This language is a small, low-level, C- and Rust-inspired systems language with a strong focus on explicit control, simple syntax, and predictable behavior. It mixes familiar C-style statements and loops with Rust-like ideas such as structs, methods, references, and explicit types. The goal is clarity over magic: most things are spelled out, lifetimes are implied by references, and there is little hidden behavior.
The language is suitable for experiments, embedded-style programs, or as a foundation for a custom runtime or OS environment. It favors straightforward semantics, manual control flow, and a minimal standard library model.
## High level description
It is a procedural language, barely higher level than C: programs are built from functions, structs, and constants. Types are explicit, references are written by prepending `&`, and arrays or slices are expressed directly in the type system. For now, code is organized with `include` to keep development simple; this will eventually be replaced by modules.
Control flow is traditional and imperative. There are `if`, `else`, `for`, `while`, and `loop` constructs, along with `break` and `continue`. Functions return values explicitly using `return`, and side effects are visible and direct.
Methods are just functions namespaced under a type, and there is no hidden constructor logic. Everything that allocates or returns a reference does so explicitly.
## Small language reference
## Types
Type aliases:
```no-test
type cstr = [u8];
```
Creates an alias. Here, `cstr` is defined as an alias for the byte array type `[u8]`.
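Since an alias may point at another alias, a compiler has to follow the chain until it reaches a real type, and it has to notice chains that loop back on themselves. The Rust sketch below illustrates that idea only; `resolve_alias` and its string-keyed table are hypothetical names made up for this example and are not mclangc's actual data structures.

```rust
use std::collections::{HashMap, HashSet};

/// Follows `name` through the alias table until it reaches a name that is
/// not itself an alias. Returns an error if the chain loops back on itself.
/// NOTE: hypothetical helper for illustration, not part of mclangc.
fn resolve_alias(aliases: &HashMap<&str, &str>, name: &str) -> Result<String, String> {
    let mut seen = HashSet::new();
    let mut current = name;
    while let Some(&target) = aliases.get(current) {
        // `insert` returns false when the name was already visited: a cycle.
        if !seen.insert(current) {
            return Err(format!("alias cycle involving `{current}`"));
        }
        current = target;
    }
    Ok(current.to_string())
}
```

With a table containing `cstr -> [u8]` and `Name -> cstr`, resolving `Name` yields `[u8]`, while a self-referential entry such as `Loop -> Loop` is reported as an error instead of looping forever.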
Built-in simple types:
* `usize` / `isize`
* `i8` through `i64`
* `u8` through `u64`
* `bool`
* array or slice types like `[u8]`
References are written with `&T`.
## Includes
```no-test
include "std.mcl";
```
Textual inclusion of another source file. This is resolved at compile time.
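To make "textual inclusion" concrete, here is a minimal Rust sketch of the idea: every `include "file";` line is replaced by the text of that file, recursively. It is an illustration under simplifying assumptions, not how mclangc does it: the real compiler works on tokens, whereas this sketch matches whole lines, uses an in-memory map in place of the filesystem, and `expand_includes` is a hypothetical name.

```rust
use std::collections::HashMap;

/// Expands every `include "name";` line of the file at `path` by splicing in
/// the text of the named file, recursively. `files` stands in for the
/// filesystem; `stack` holds the files currently being expanded so that
/// include cycles are rejected. Hypothetical helper, not mclangc code.
fn expand_includes(
    path: &str,
    files: &HashMap<&str, &str>,
    stack: &mut Vec<String>,
) -> Result<String, String> {
    if stack.iter().any(|p| p == path) {
        return Err(format!("include cycle through \"{path}\""));
    }
    let source = files
        .get(path)
        .ok_or_else(|| format!("file not found: \"{path}\""))?;
    stack.push(path.to_string());
    let mut out = String::new();
    for line in source.lines() {
        // Recognise lines of the exact form: include "name";
        let included = line
            .trim()
            .strip_prefix("include \"")
            .and_then(|rest| rest.strip_suffix("\";"));
        match included {
            Some(name) => out.push_str(&expand_includes(name, files, stack)?),
            None => {
                out.push_str(line);
                out.push('\n');
            }
        }
    }
    stack.pop();
    Ok(out)
}
```

Called with an empty `stack`, the function returns the fully expanded source text, or an error when a file is missing or includes itself (directly or indirectly).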
## Structs
```no-test
struct Foo {
a: usize,
b: &str
}
```
Structs group named fields. Fields have explicit types. There are no implicit constructors.
## Functions
Function definition syntax:
```no-test
fn name(arg: Type, ...) -> ReturnType {
...
}
```
Example:
```no-test
fn mul(n: usize, n2: usize) -> usize {
return n * n2;
}
```
Functions may return values using `return`. Void functions omit the return type.
## Methods
Methods are functions scoped to a type. As of now, methods can only be attached to [structs](#structs):
```no-test
fn Foo.new(a1: usize, b: &str) -> &Foo {
return &Foo {
a: a1,
b: b
};
}
```
This defines a constructor-like function for `Foo`. It explicitly returns a reference to a `Foo` value.
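For readers coming from Rust, the closest counterpart is an associated function on an `impl` block. The sketch below is only a comparison, not a translation: Rust does not allow returning a reference to a freshly built local, so `Box` stands in for the `&Foo` return type here, and how this language allocates the returned value is an assumption this document does not cover.

```rust
/// Rust counterpart of the `Foo` struct from the Structs section.
#[derive(Debug)]
struct Foo<'a> {
    a: usize,
    b: &'a str,
}

impl<'a> Foo<'a> {
    /// Constructor-like associated function: nothing runs implicitly,
    /// the struct literal is built and returned by hand.
    /// `Box` is an assumption standing in for the `&Foo` return type.
    fn new(a1: usize, b: &'a str) -> Box<Foo<'a>> {
        Box::new(Foo { a: a1, b })
    }
}
```

As in the example above, `new` is an ordinary function that happens to live in the type's namespace; calling `Foo::new(1, "owo")` just builds the value and hands it back.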
## Variables
Variables are declared with `let`:
```no-test
let obj = Foo::new(1, "owo");
```
The type can be omitted, as here, in which case it is inferred. The parser treats the annotation as optional, so an explicit type can be written when desired.
## Field access
Struct fields can be accessed through references:
```no-test
obj->b;
```
Fields can also be accessed with `.`, without dereferencing:
```no-test
obj.a;
```
## Control flow
Infinite loop:
```no-test
loop {
...
}
```
While loop:
```no-test
while (true) {
...
}
```
For loop:
```no-test
for (let i = 0; i < 10; i += 1) {
...
}
```
Conditionals:
```no-test
if (i > 7) {
break;
} else {
continue;
}
```
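One detail worth spelling out is how `continue` interacts with the step expression of a `for` loop. Assuming this language follows C here (an assumption, the reference above does not say), `continue` still runs the step (`i += 1`) before the condition is checked again. The Rust sketch below models the `for` loop and the `if`/`else` from the examples above with a `while` loop and a labeled block (Rust 1.65 or newer); `run_loop` is a hypothetical name for this illustration.

```rust
/// Models `for (let i = 0; i < 10; i += 1) { if (i > 7) { break; } else { continue; } }`
/// under the assumption of C-style semantics, where `continue` still runs the step.
/// Returns the final value of `i` and the number of times the body was entered.
fn run_loop() -> (usize, usize) {
    let mut i: usize = 0; // init
    let mut iterations: usize = 0;
    'outer: while i < 10 {
        // The labeled block is the loop body; leaving it early models `continue`.
        'body: {
            iterations += 1;
            if i > 7 {
                break 'outer; // `break`: leave the loop, skipping the step
            } else {
                break 'body; // `continue`: skip the rest of the body only
            }
        }
        i += 1; // step, reached after a normal pass and after `continue`
    }
    (i, iterations)
}
```

The body is entered for `i = 0` through `i = 8`, and the loop exits on `break` with `i == 8`. A naive rewrite that turned `continue` into a plain `while`-loop `continue` would skip the step and never terminate.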
## Constants
```no-test
const FOO: usize = main;
```
## Statics
```no-test
static FOO: usize = main;
```


@@ -1,3 +1,6 @@
+#![doc = include_str!("../docs/main.md")]
pub mod common;
pub mod tokeniser;
pub mod parser;


@@ -1,4 +1,6 @@
-use std::{path::PathBuf, process::ExitCode};
+#![doc = include_str!("../docs/main.md")]
+use std::path::PathBuf;
use clap::Parser;
// Importing logger here too cause the logger macros dont work outside the mclanc lib


@@ -5,9 +5,9 @@ use ast::{expr::Block, Ast, Program};
use crate::{cli::CliArgs, tokeniser::{Token, tokentype::*}};
pub mod ast;
-mod expr;
-mod stat;
-mod utils;
+pub mod expr;
+pub mod stat;
+pub mod utils;
pub mod typ;
type Result<T> = anyhow::Result<T>;


@@ -1,10 +1,8 @@
use std::str;
use anyhow::bail;
use crate::cli::CliArgs;
use crate::common::loc::LocBox;
-use crate::{cli, error, lerror};
+use crate::{error, lerror};
use crate::parser::ast::{Program, TString, TokenType};
use crate::parser::ast::statement::Let;
use crate::parser::expr::parse_expr;
@@ -48,7 +46,7 @@ pub fn parse_statement(tokens: &mut Vec<Token>, cli: &CliArgs, prog: &mut Progra
}
}
-fn parse_include(tokens: &mut Vec<Token>, cli: &CliArgs, prog: &mut Program) -> Result<LocBox<Statement>> {
+pub fn parse_include(tokens: &mut Vec<Token>, cli: &CliArgs, prog: &mut Program) -> Result<LocBox<Statement>> {
let kw = utils::check_consume_or_err(tokens, TokenType::Keyword(Keyword::Include), "")?;
let TokenType::String(include_path) = utils::check_consume_or_err(tokens, TokenType::String(TString::default()), "")?.tt().clone() else {panic!()};
_ = utils::check_consume_or_err(tokens, TokenType::Punct(Punctuation::Semi), "")?;
@@ -81,7 +79,7 @@ fn parse_include(tokens: &mut Vec<Token>, cli: &CliArgs, prog: &mut Program) ->
Ok(LocBox::new(kw.loc(), Statement::Include))
}
-fn parse_enum(tokens: &mut Vec<Token>) -> Result<LocBox<Statement>> {
+pub fn parse_enum(tokens: &mut Vec<Token>) -> Result<LocBox<Statement>> {
let kw = utils::check_consume_or_err(tokens, TokenType::Keyword(Keyword::Enum), "")?;
let name = utils::check_consume_or_err(tokens, TokenType::ident(""), "")?.tt().unwrap_ident();
_ = utils::check_consume(tokens, TokenType::Delim(Delimiter::CurlyL));
@@ -105,7 +103,7 @@ fn parse_enum(tokens: &mut Vec<Token>) -> Result<LocBox<Statement>> {
Ok(LocBox::new(kw.loc(), Statement::Enum(Enum { name, fields })))
}
-fn parse_struct(tokens: &mut Vec<Token>) -> Result<LocBox<Statement>> {
+pub fn parse_struct(tokens: &mut Vec<Token>) -> Result<LocBox<Statement>> {
let kw = utils::check_consume_or_err(tokens, TokenType::Keyword(Keyword::Struct), "")?;
let name = utils::check_consume_or_err(tokens, TokenType::ident(""), "")?.tt().unwrap_ident();
_ = utils::check_consume(tokens, TokenType::Delim(Delimiter::CurlyL));
@@ -131,7 +129,7 @@ fn parse_struct(tokens: &mut Vec<Token>) -> Result<LocBox<Statement>> {
Ok(LocBox::new(kw.loc(), Statement::Struct(Struct { name, fields })))
}
-fn parse_static(tokens: &mut Vec<Token>, cli: &CliArgs, prog: &mut Program) -> Result<LocBox<Statement>> {
+pub fn parse_static(tokens: &mut Vec<Token>, cli: &CliArgs, prog: &mut Program) -> Result<LocBox<Statement>> {
let kw = utils::check_consume_or_err(tokens, TokenType::Keyword(Keyword::Static), "")?;
let name = utils::check_consume_or_err(tokens, TokenType::ident(""), "")?.tt().unwrap_ident();
@@ -146,7 +144,7 @@ fn parse_static(tokens: &mut Vec<Token>, cli: &CliArgs, prog: &mut Program) -> R
Ok(LocBox::new(kw.loc(), Statement::StaticVar(StaticVar { name, typ, val })))
}
-fn parse_let(tokens: &mut Vec<Token>, cli: &CliArgs, prog: &mut Program) -> Result<LocBox<Statement>> {
+pub fn parse_let(tokens: &mut Vec<Token>, cli: &CliArgs, prog: &mut Program) -> Result<LocBox<Statement>> {
let kw = utils::check_consume_or_err(tokens, TokenType::Keyword(Keyword::Let), "")?;
let name = utils::check_consume_or_err(tokens, TokenType::ident(""), "")?.tt().unwrap_ident();
let mut typ = None;
@@ -164,7 +162,7 @@ fn parse_let(tokens: &mut Vec<Token>, cli: &CliArgs, prog: &mut Program) -> Resu
_ = utils::check_consume_or_err(tokens, TokenType::Punct(Punctuation::Semi), "")?;
Ok(LocBox::new(kw.loc(), Statement::Let(Let{ name, typ, val })))
}
-fn parse_constant(tokens: &mut Vec<Token>, cli: &CliArgs, prog: &mut Program) -> Result<LocBox<Statement>> {
+pub fn parse_constant(tokens: &mut Vec<Token>, cli: &CliArgs, prog: &mut Program) -> Result<LocBox<Statement>> {
let kw = utils::check_consume_or_err(tokens, TokenType::Keyword(Keyword::Const), "")?;
if let Some(_) = utils::check(tokens, TokenType::Keyword(Keyword::Fn)) {
@@ -182,7 +180,7 @@ fn parse_constant(tokens: &mut Vec<Token>, cli: &CliArgs, prog: &mut Program) ->
Ok(LocBox::new(kw.loc(), Statement::ConstVar(ConstVar { name, typ, val })))
}
-fn parse_type_alias(tokens: &mut Vec<Token>) -> Result<LocBox<Statement>> {
+pub fn parse_type_alias(tokens: &mut Vec<Token>) -> Result<LocBox<Statement>> {
let kw = utils::check_consume_or_err(tokens, TokenType::Keyword(Keyword::Type), "")?;
let name = utils::check_consume_or_err(tokens, TokenType::ident(""), "")?.tt().unwrap_ident();
_ = utils::check_consume_or_err(tokens, TokenType::Punct(Punctuation::Eq), "")?;
@@ -192,7 +190,7 @@ fn parse_type_alias(tokens: &mut Vec<Token>) -> Result<LocBox<Statement>> {
Ok(LocBox::new(kw.loc(), Statement::TypeAlias(TypeAlias { name, typ })))
}
-fn parse_fn(tokens: &mut Vec<Token>, cli: &CliArgs, prog: &mut Program) -> Result<LocBox<Statement>> {
+pub fn parse_fn(tokens: &mut Vec<Token>, cli: &CliArgs, prog: &mut Program) -> Result<LocBox<Statement>> {
error!("fnc");
// Just remove the kw since we checked it before
let kw = utils::check_consume_or_err(tokens, TokenType::Keyword(Keyword::Fn), "")?;
@@ -241,7 +239,7 @@ fn parse_fn(tokens: &mut Vec<Token>, cli: &CliArgs, prog: &mut Program) -> Resul
// usize is: 0 = no self, static; 1 = self, ref; 2 = self, mut ref
-fn parse_fn_params(tokens: &mut Vec<Token>) -> Result<(usize, Vec<(Ident, LocBox<Type>)>)> {
+pub fn parse_fn_params(tokens: &mut Vec<Token>) -> Result<(usize, Vec<(Ident, LocBox<Type>)>)> {
let mut args = Vec::new();
utils::check_consume_or_err(tokens, TokenType::Delim(Delimiter::ParenL), "")?;


@@ -173,7 +173,7 @@ impl TokenType {
_ => panic!("Expected {}, got {self}", Self::ident(""))
}
}
-pub fn ident(s: &str) -> Self {
+pub fn ident(s: impl ToString) -> Self {
Self::Ident(Ident(s.to_string()))
}
pub fn number(val: usize, base: u8, signed: bool) -> Self {

BIN
test

Binary file not shown.

BIN
test.o

Binary file not shown.


@@ -0,0 +1,52 @@
use mclangc::{
cli::CliArgs,
common::{Loc, loc::LocBox},
parser::{self, ast::{
Ident, Keyword, Number, Punctuation, TokenType, expr::Expr, literal::Literal,
statement::{Statement, ConstVar},
}},
tokeniser::Token, validator::predefined::BuiltinType
};
#[test]
fn test_parse_const_stat() {
let mut tokens = vec![
Token::new_test(TokenType::Keyword(Keyword::Const)),
Token::new_test(TokenType::ident("MyConstant")),
Token::new_test(TokenType::Punct(Punctuation::Colon)),
Token::new_test(TokenType::ident("usize")),
Token::new_test(TokenType::Punct(Punctuation::Eq)),
Token::new_test(TokenType::number(69, 10, false)),
Token::new_test(TokenType::Punct(Punctuation::Semi)),
Token::new_test(TokenType::ident("overflow")),
Token::new_test(TokenType::ident("overflow")),
Token::new_test(TokenType::ident("overflow")),
];
tokens.reverse();
let res = parser::stat::parse_statement(&mut tokens, &CliArgs::default(), &mut super::get_prog());
assert!(res.is_ok(), "{res:?}");
match res {
Ok(res) => {
assert!(res.is_some(), "{res:?}");
match res {
Some(res) => {
assert_eq!(res.inner().clone(), Statement::ConstVar(ConstVar {
name: Ident::new("MyConstant"),
typ: LocBox::new(&Loc::default(), BuiltinType::usize()),
val: LocBox::new(&Loc::default(), Expr::Literal(String::new(), Literal::Number(Number { val: 69, base: 10, signed: false })))
}));
}
None => unreachable!()
}
}
Err(_) => unreachable!()
}
assert!(tokens.len() == 3, "leftover token count incorrect");
}

45
tests/parser/stat/mod.rs Normal file

@@ -0,0 +1,45 @@
use mclangc::{
common::{Loc, loc::LocBox},
parser::{
ast::{
Ident, Program,
statement::{Enum, Struct},
typ::Type
}
},
validator::predefined::{BuiltinType, load_builtin}
};
mod constant;
mod statc;
mod type_alias;
pub fn get_prog() -> Program {
let mut prog = Program::default();
load_builtin(&mut prog);
let loc = Loc::default();
prog.structs.insert(Ident::new("MyStruct"), LocBox::new(&loc, Struct {
name: Ident::new("MyStruct"),
fields: vec![
(Ident::new("foo"), LocBox::new(&loc, BuiltinType::usize())),
(Ident::new("bar"), LocBox::new(&loc, BuiltinType::bool())),
(Ident::new("baz"), LocBox::new(&loc, Type::Owned(Ident::new("str")).as_ref())),
]
}));
prog.enums.insert(Ident::new("MyEnum"), LocBox::new(&loc, Enum {
name: Ident::new("MyEnum"),
fields: vec![],
}));
prog.types.insert(Ident::new("MyType"), LocBox::new(&loc, Type::Owned(Ident::new("MyStruct"))));
prog.types.insert(Ident::new("MyType2"), LocBox::new(&loc, Type::Owned(Ident::new("MyType"))));
prog.types.insert(Ident::new("MyType3"), LocBox::new(&loc, Type::Owned(Ident::new("MyType2")).as_ref()));
prog.types.insert(Ident::new("MyTypeInf"), LocBox::new(&loc, Type::Owned(Ident::new("MyTypeInf"))));
prog.types.insert(Ident::new("MyTypeInfIndirect"), LocBox::new(&loc, Type::Owned(Ident::new("MyType4"))));
prog.types.insert(Ident::new("MyType4"), LocBox::new(&loc, Type::Owned(Ident::new("MyType5"))));
prog.types.insert(Ident::new("MyType5"), LocBox::new(&loc, Type::Owned(Ident::new("MyType4"))));
prog
}


@@ -0,0 +1,48 @@
use mclangc::{
cli::CliArgs,
common::{Loc, loc::LocBox},
parser::{self, ast::{
Ident, Keyword, Number, Punctuation, TokenType, expr::Expr, literal::Literal,
statement::{Statement, StaticVar},
}},
tokeniser::Token, validator::predefined::BuiltinType
};
#[test]
fn test_parse_static_stat() {
let mut tokens = vec![
Token::new_test(TokenType::Keyword(Keyword::Static)),
Token::new_test(TokenType::ident("MyStatic")),
Token::new_test(TokenType::Punct(Punctuation::Colon)),
Token::new_test(TokenType::ident("usize")),
Token::new_test(TokenType::Punct(Punctuation::Eq)),
Token::new_test(TokenType::number(69, 10, false)),
Token::new_test(TokenType::Punct(Punctuation::Semi)),
Token::new_test(TokenType::ident("overflow")),
Token::new_test(TokenType::ident("overflow")),
Token::new_test(TokenType::ident("overflow")),
];
tokens.reverse();
let res = parser::stat::parse_statement(&mut tokens, &CliArgs::default(), &mut super::get_prog());
assert!(res.is_ok(), "{res:?}");
match res {
Ok(res) => {
assert!(res.is_some(), "{res:?}");
match res {
Some(res) => {
assert_eq!(res.inner().clone(), Statement::StaticVar(StaticVar {
name: Ident::new("MyStatic"),
typ: LocBox::new(&Loc::default(), BuiltinType::usize()),
val: LocBox::new(&Loc::default(), Expr::Literal(String::new(), Literal::Number(Number { val: 69, base: 10, signed: false })))
}));
}
None => unreachable!()
}
}
Err(_) => unreachable!()
}
assert!(tokens.len() == 3, "leftover token count incorrect");
}


@@ -0,0 +1,48 @@
use mclangc::{
cli::CliArgs,
common::{Loc, loc::LocBox},
parser::{self, ast::{
Ident, Keyword, Number, Punctuation, TokenType, expr::Expr, literal::Literal,
statement::{Statement, StaticVar},
}},
tokeniser::Token, validator::predefined::BuiltinType
};
#[test]
fn test_parse_alias_stat() {
let mut tokens = vec![
Token::new_test(TokenType::Keyword(Keyword::Static)),
Token::new_test(TokenType::ident("MyStatic")),
Token::new_test(TokenType::Punct(Punctuation::Colon)),
Token::new_test(TokenType::ident("usize")),
Token::new_test(TokenType::Punct(Punctuation::Eq)),
Token::new_test(TokenType::number(69, 10, false)),
Token::new_test(TokenType::Punct(Punctuation::Semi)),
Token::new_test(TokenType::ident("overflow")),
Token::new_test(TokenType::ident("overflow")),
Token::new_test(TokenType::ident("overflow")),
];
tokens.reverse();
let res = parser::stat::parse_statement(&mut tokens, &CliArgs::default(), &mut super::get_prog());
assert!(res.is_ok(), "{res:?}");
match res {
Ok(res) => {
assert!(res.is_some(), "{res:?}");
match res {
Some(res) => {
assert_eq!(res.inner().clone(), Statement::StaticVar(StaticVar {
name: Ident::new("MyStatic"),
typ: LocBox::new(&Loc::default(), BuiltinType::usize()),
val: LocBox::new(&Loc::default(), Expr::Literal(String::new(), Literal::Number(Number { val: 69, base: 10, signed: false })))
}));
}
None => unreachable!()
}
}
Err(_) => unreachable!()
}
assert!(tokens.len() == 3, "leftover token count incorrect");
}