Funky C for literate programming


Main ideas

This is a port of LLIte in C. The reason for it is to experiment with writing functional code in standard C and compare the experience with using a functional language like F#. It is in a way a continuation of my previous posts on the topic.

I will be using glib and a header of convenient macros/functions to help me (lutils.h). I don’t think that is cheating. Any modern C practitioner has their bag of tricks …

Don’t tell me this is not idiomatic C. I already know that.

#include <string.h>
#include <stdbool.h>

#include <glib.h>
#include <glib/gprintf.h>

#ifdef ARENA
#include "arena.h"
#endif

#include "lutils.h"

Lack of tuples

In the snippet below I overcame this deficiency by declaring a struct. C99 compound literals with designated initializers (the “new constructor syntax”) make initializing a static table simple.

typedef struct LangSymbols { char language[40]; char start[10]; char end[10];} LangSymbols;

static
LangSymbols* s_lang_params_table[] = {
    &(LangSymbols) {.language = "fsharp",   .start = "(*" "*", .end = "*" "*)"},
    &(LangSymbols) {.language = "c",        .start = "/*" "*", .end = "*" "*/"},
    &(LangSymbols) {.language = "csharp",   .start = "/*" "*", .end = "*" "*/"},
    &(LangSymbols) {.language = "java",     .start = "/*" "*", .end = "*" "*/"},
    NULL
};

Folding over arrays

I need to gather all the languages, i.e. perform a fold over the array. You might have noticed the propensity to add a NULL terminator to arrays (as for strings). This lets me avoid passing a size to functions and makes writing utility macros (such as the foreach below) simpler.

In the rest of the program, every time I end a function with _z, it is because I consider it generally usable and I add a version of it without the _z to lutils.h.

#define array_foreach_z(p) for(; *(p) != NULL; ++(p))

static
char* summary(LangSymbols** symbols) {

    GString* langs = g_string_sized_new(20);
    array_foreach(symbols) g_string_append_printf(langs, "%s ", (*symbols)->language);

    g_string_truncate(langs, strlen(langs->str) - 1);

    GString* usage = g_string_sized_new(100);

    g_string_printf(usage,
        "You should specify:\n\t. either -l or -o and -p\n"
        "\t. either -indent or -P and -C\n"
        "\t. -l supports: %s"
        ,langs->str);

    return usage->str;
}
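
For readers who want to try the NULL-terminated fold without glib, here is a minimal sketch in plain C. The names ARRAY_FOREACH and join_names are hypothetical and are not part of lutils.h; unlike the macro above, this variant takes an explicit cursor so the caller's pointer is left untouched.

```c
#include <stddef.h>
#include <string.h>

/* Iterate a NULL-terminated array of pointers without knowing its length.
   ARRAY_FOREACH is a hypothetical name, not the macro from lutils.h. */
#define ARRAY_FOREACH(it, arr) for ((it) = (arr); *(it) != NULL; ++(it))

/* Fold over the array: join all strings with single spaces into out
   (capacity cap). Returns the number of items visited. Output is silently
   truncated if out is too small. */
static size_t join_names(const char *const *names, char *out, size_t cap) {
    const char *const *it;
    size_t count = 0;

    if (cap == 0) return 0;
    out[0] = '\0';

    ARRAY_FOREACH(it, names) {
        if (count > 0) strncat(out, " ", cap - strlen(out) - 1);
        strncat(out, *it, cap - strlen(out) - 1);
        ++count;
    }
    return count;
}
```

Called with a table such as {"fsharp", "c", "java", NULL}, the function visits three items and leaves the space-separated names in the buffer.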

Find an item in an array based on some expression. Returns NULL if not found. Again, this is a common task, hence I’ll abstract it out with a macro (that ends up being a cute use of gcc statement expressions).

#define array_find_z(arr, ...)                          \
    ({                                                  \
        array_foreach(arr) if (__VA_ARGS__) break;      \
        *arr;                                           \
    })

static
LangSymbols* lang_find_symbols(LangSymbols** symbols, char* lang) {
    g_assert(symbols);
    g_assert(lang);

    return array_find(symbols, !strcmp((*symbols)->language, lang));
}
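
The same find idea can be sketched without lutils.h. The block below assumes a compiler that supports statement expressions (GCC or Clang); ARRAY_FIND, Lang and find_lang are hypothetical names chosen for illustration. This variant declares its own cursor inside the statement expression, so it does not advance the caller's pointer.

```c
#include <stddef.h>
#include <string.h>

typedef struct Lang { const char *name; const char *open; } Lang;

/* GCC/Clang statement expression: the value of the ({ ... }) block is its
   last expression. `it` points at each element in turn, and the predicate
   passed as the trailing arguments may refer to it. */
#define ARRAY_FIND(type, it, arr, ...)                   \
    ({                                                   \
        type *const *it = (arr);                         \
        while (*it != NULL && !(__VA_ARGS__)) ++it;      \
        *it; /* NULL when nothing matched */             \
    })

/* Look a language up by name in a NULL-terminated table. */
static Lang *find_lang(Lang *const *table, const char *name) {
    return ARRAY_FIND(Lang, it, table, strcmp((*it)->name, name) == 0);
}
```

Because the predicate is pasted into the loop condition, it is evaluated once per element and never on the terminating NULL.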

Deallocating stuff

You might wonder why I don’t seem overly worried about deallocating the memory that I allocate. I haven’t gone crazy (yet). You’ll see.

Discriminated unions

Here are the discriminated union macros from a previous blog post of mine. I’ll need a couple of these, and I pre-declare two functions.

union_decl(CodeSymbols, Indented, Surrounded)
    union_type(Indented,    int indentation;)
    union_type(Surrounded,  char* start_code; char* end_code;)
union_end(CodeSymbols);

typedef struct Options {
    char*           start_narrative;
    char*           end_narrative;
    CodeSymbols*    code_symbols;
} Options;

static
gchar* translate(Options*, gchar*);

union_decl(Block, Code, Narrative)
    union_type(Code,        char* code)
    union_type(Narrative,   char* narrative)
union_end(Block);
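
The union_decl macros come from an earlier post and are not shown here. As a rough idea of what such a discriminated union looks like when written by hand, here is a sketch; the names Sym, SymKind, sym_new and sym_width are hypothetical, and the real macros may expand to something quite different.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum SymKind { SYM_INDENTED, SYM_SURROUNDED } SymKind;

typedef struct Sym {
    SymKind kind;                       /* the discriminant */
    union {
        struct { int indentation; } indented;
        struct { const char *start_code; const char *end_code; } surrounded;
    } as;
} Sym;

/* Heap-allocate a copy of a value, loosely mimicking what a union_new-style
   constructor might do. The caller owns the result. */
static Sym *sym_new(Sym value) {
    Sym *p = malloc(sizeof *p);
    assert(p);
    *p = value;
    return p;
}

/* "Pattern match" on the discriminant with a ternary chain. */
static int sym_width(const Sym *s) {
    return s->kind == SYM_INDENTED   ? s->as.indented.indentation :
           s->kind == SYM_SURROUNDED ? 0                          :
                                       -1; /* unreachable for valid kinds */
}
```

A value is built with a compound literal, for example sym_new((Sym){ .kind = SYM_INDENTED, .as.indented.indentation = 4 }).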

Main data structure

We want to use higher level abstractions than standard C arrays, hence we’ll pick a convenient data structure to use in the rest of the code. A queue lets you insert at the front and back, with just one pointer of overhead over a singly linked list. Hence it is my data structure of choice for this program.

static
GQueue* blockize(Options*, char*);

There is already a function in glib to check if a string has a certain prefix (g_str_has_prefix). We need one that returns the remaining string after the prefix. We also define a g_slow_assert that is executed only if G_ENABLE_SLOW_ASSERT is defined.

static
char* str_after_prefix(char* src, char* prefix) {
    g_assert(src);
    g_assert(prefix);
    g_slow_assert(g_str_has_prefix(src, prefix));

    while(*prefix != '\0')
        if(*src == *prefix) ++src, ++prefix;
        else break;

    return src;
}
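
A glib-free variant of the same helper can be sketched with strncmp. This is an assumption-laden alternative rather than the function used in the program: the hypothetical after_prefix returns NULL when the prefix is absent, instead of asserting.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the part of src that follows prefix, or NULL when src does not
   start with prefix. An empty prefix matches any string. */
static const char *after_prefix(const char *src, const char *prefix) {
    assert(src);
    assert(prefix);

    size_t n = strlen(prefix);
    return strncmp(src, prefix, n) == 0 ? src + n : NULL;
}
```

Returning NULL lets a caller fold the "has prefix" test and the "skip prefix" step into one call, at the price of a null check.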

Tokenizer

The structure of the function is identical to the F# version. The big bread-winners are statement expressions and local functions …

It is interesting how you can replicate the shape of an F# function by substituting ternary operators for match statements.

It is nothing magic, just a way to have a case statement as an expression, but it is suggestive of its more functional counterpart.

#define NL "\n"

union_decl(Token, OpenComment, CloseComment, Text)
    union_type(OpenComment, int line)
    union_type(CloseComment,int line)
    union_type(Text,        char* text)
union_end(Token);

GQueue* tokenize(Options* options, char* source) {
    g_assert(options);
    g_assert(source);

    struct tuple { int line; GString* acc; char* rem;};

    bool is_opening(char* src)      { return g_str_has_prefix(src, options->start_narrative);}
    bool is_closing(char* src)      { return g_str_has_prefix(src, options->end_narrative);}
    char* remaining_open (char* src){ return str_after_prefix(src, options->start_narrative);}
    char* remaining_close(char* src){ return str_after_prefix(src, options->end_narrative);}

    struct tuple text(char* src, GString* acc, int line) {
        inline struct tuple stop_parse_text()
            { return (struct tuple) {.line = line, .acc = acc, .rem = src};}

        return  str_empty (src)? stop_parse_text() :
                is_opening(src)? stop_parse_text() :
                is_closing(src)? stop_parse_text() :
                                ({
                                  int line2         = g_str_has_prefix(src, NL) ? line + 1
                                                                                : line;
                                  GString* newAcc   = g_string_append_c(acc, *src);
                                  char* rem         = src + 1;
                                  text(rem, newAcc, line2);
                                });
    }

    GQueue* tokenize_rec(char* src, GQueue* acc, int line) {
        return  str_empty(src)  ?   acc                     :
                is_opening(src) ?   tokenize_rec(remaining_open(src),
                                        g_queue_push_back(acc, union_new(
                                                    Token, OpenComment, .line = line)),
                                        line)        :
                is_closing(src) ?   tokenize_rec(remaining_close(src),
                                               g_queue_push_back(acc, union_new(
                                                    Token, CloseComment, .line = line)),
                                        line)        :
                                ({
                                    struct tuple t = text(src, g_string_sized_new(200), line);
                                    tokenize_rec(t.rem,
                                        g_queue_push_back(acc, union_new(
                                                    Token, Text, .text = t.acc->str)), t.line);
                                 });
    }

    return tokenize_rec(source, g_queue_new(), 1);
}
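
The ternary-chain shape used in the tokenizer can be tried in isolation, without nested functions or glib. The sketch below uses hypothetical names (Head, classify, count_open) and assumes non-empty delimiters; it only classifies what sits at the head of the input and counts narrative openings, it does not build tokens.

```c
#include <string.h>

typedef enum Head { AT_END, AT_OPEN, AT_CLOSE, AT_TEXT } Head;

static int starts_with(const char *s, const char *prefix) {
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* One ternary chain replaces an if/else ladder and reads like a match. */
static Head classify(const char *src, const char *open, const char *close) {
    return  *src == '\0'            ? AT_END   :
            starts_with(src, open)  ? AT_OPEN  :
            starts_with(src, close) ? AT_CLOSE :
                                      AT_TEXT;
}

/* Walk the string with the classifier and count opening delimiters.
   Assumes open and close are non-empty, otherwise the loop cannot advance. */
static int count_open(const char *src, const char *open, const char *close) {
    int n = 0;
    while (*src != '\0') {
        Head h = classify(src, open, close);
        n   += (h == AT_OPEN);
        src += h == AT_OPEN  ? strlen(open)  :
               h == AT_CLOSE ? strlen(close) :
                               1;
    }
    return n;
}
```

The order of the arms matters in the same way as the order of cases in a match: the end-of-input test comes first so the prefix checks never run on an empty string.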

Parser

This again has a similar structure to the F# version, just longer. It is very long because it contains 3 (nested) functions, which are on the verbose side in C.

The creation of an error macro is unfortunate. I just don’t know how to adapt g_assert_e so that it works for functions that don’t return a pointer.

I also need a simple function report_error to exit gracefully giving a message to the user. I didn’t find such a thing in glib (?)

#define report_error_z(...) G_STMT_START { g_print(__VA_ARGS__); exit(1); } G_STMT_END                                                            

union_decl(Chunk, NarrativeChunk, CodeChunk)
    union_type(NarrativeChunk,  GQueue* tokens)
    union_type(CodeChunk,       GQueue* tokens)
union_end(Chunk);

static
GQueue* parse(Options* options, GQueue* tokens) {
    g_assert(options);
    g_assert(tokens);

    struct tuple { GQueue* acc; GQueue* rem;};

    #define error(...) \
        ({ report_error(__VA_ARGS__); (struct tuple) {.acc = NULL, .rem = NULL}; })

    struct tuple parse_narrative(GQueue* acc, GQueue* rem) {

        bool isEmpty    = g_queue_is_empty(rem);
        Token* h        = g_queue_pop_head(rem);
        GQueue* t       = rem;

        return  isEmpty                 ?
                                    error("You haven't closed your last narrative comment") :
                h->kind == OpenComment  ?
                    error("Don't open narrative comments inside narrative comments at line %i",
                          h->OpenComment.line)                                              :
                h->kind == CloseComment ? (struct tuple) {.acc = acc, .rem = t}             :
                h->kind == Text         ? parse_narrative(g_queue_push_back(acc, h), t)     :
                                          error("Should never get here");
    };

    struct tuple parse_code(GQueue* acc, GQueue* rem) {

        bool isEmpty    = g_queue_is_empty(rem);
        Token* h    = g_queue_pop_head(rem);
        GQueue* t   = rem;

        return  isEmpty                 ? (struct tuple) {.acc = acc, .rem = t}         :
                h->kind == OpenComment  ?
                    (struct tuple) {.acc = acc, .rem = g_queue_push_front(rem, h)}      :
                h->kind == CloseComment ? parse_code(g_queue_push_back(acc, h), rem)    :
                h->kind == Text         ? parse_code(g_queue_push_back(acc, h), rem)    :
                                          error("Should never get here");
    };
    #undef error

    GQueue* parse_rec(GQueue* acc, GQueue* rem) {

        bool isEmpty    = g_queue_is_empty(rem);
        Token* h    = g_queue_pop_head(rem);
        GQueue* t   = rem;

        return  isEmpty                 ? acc                                           :
                h->kind == OpenComment  ? ({
                                           GQueue* emp = g_queue_new();
                                           struct tuple tu = parse_narrative(emp, t);
                                           Chunk* ch = union_new(
                                                Chunk, NarrativeChunk, .tokens = tu.acc );
                                           GQueue* newQ = g_queue_push_back(acc, ch);
                                           parse_rec(newQ, tu.rem);
                                           })                                            :
                h->kind == CloseComment ?
                    report_error_e(
                        "Don't insert a close narrative comment at the start of your"
                        " program at line %i",
                                            h->OpenComment.line)                         :
                h->kind == Text         ?
                                        ({
                                           GQueue* emp = g_queue_new();
                                           struct tuple tu =
                                                parse_code(g_queue_push_front(emp, h), t);
                                           parse_rec(g_queue_push_back
                                            (acc,
                                             union_new(Chunk, CodeChunk, .tokens = tu.acc)),
                                             tu.rem);
                                          })                                                               :
                                          g_assert_no_match;
    }

    return parse_rec(g_queue_new(), tokens);
}

Flattener

This follows the usual practice of representing folds as foreach statements (and maps too). Perhaps I shall build better abstractions for them at some point. I also introduce a little macro to simplify the writing of GFunc lambdas, given how pervasive they are.

Again, note how heavily this relies on the ternary operator …

#define g_func_z(type, name, ...) lambda(void,                                              \
                                        (void* private_it, G_GNUC_UNUSED void* private_no){ \
                                       type name = private_it;                              \
                                       __VA_ARGS__                                          \
                                })

static
GQueue* flatten(Options* options, GQueue* chunks) {
    GString* token_to_string_narrative(Token* tok) {
        return  tok->kind == OpenComment ||
                tok->kind == CloseComment   ?
                    report_error_e("Cannot nest narrative comments at line %i",
                                   tok->OpenComment.line)                                   :
                tok->kind == Text           ? g_string_new(tok->Text.text)                  :
                                              g_assert_no_match;
    }
    GString* token_to_string_code(Token* tok) {
        return  tok->kind == OpenComment    ?
                report_error_e(
                    "Open narrative comment cannot be in code at line %i."
                    " Perhaps you have an open comment "
                    "in a code string before this comment tag?"
                    , tok->OpenComment.line)                                                :
                tok->kind == CloseComment   ? g_string_new(options->end_narrative)          :
                tok->kind == Text           ? g_string_new(tok->Text.text)                  :
                                              g_assert_no_match;
    }
    Block* flatten_chunk(Chunk* ch) {
        return  ch->kind == NarrativeChunk  ? ({
                               GQueue* tokens = ch->NarrativeChunk.tokens;
                               GString* res = g_string_sized_new(256);
                               g_queue_foreach(tokens, g_func(Token*, tok,
                                                g_string_append(
                                                    res,
                                                    token_to_string_narrative(tok)->str);
                                                ), NULL);
                               union_new(Block, Narrative, .narrative = res->str);
                                               })   :
                ch->kind == CodeChunk       ? ({
                               GQueue* tokens = ch->CodeChunk.tokens;
                               GString* res = g_string_sized_new(256);
                               g_queue_foreach(tokens, g_func(Token*, tok,
                                                        g_string_append(
                                                            res,
                                                            token_to_string_code(tok)->str);
                                                        ), NULL);
                               union_new(Block, Code, .code = res->str);
                                               })   :
                               g_assert_no_match;
    }

    GQueue* res = g_queue_new();
    g_queue_foreach(chunks, g_func(Chunk*, ch,
                                Block* b = flatten_chunk(ch);
                                g_queue_push_tail(res, b);
                                ) ,NULL);
    return res;
}

Now we can tie everything together to build blockize, which is our parse tree.

static
GQueue* blockize(Options* options, char* source) {
    GQueue* tokens  = tokenize(options, source);
    GQueue* blocks  = parse(options, tokens);
    return flatten(options, blocks);
}

Define the phases

In C you can easily forward declare functions, so you don’t have to come up with some clever escamotage like we had to do in F#.

static
GQueue* remove_empty_blocks(Options*, GQueue*);
static
GQueue* merge_blocks(Options*, GQueue*);
static
GQueue* add_code_tags(Options*, GQueue*);

static
GQueue* process_phases(Options* options, GQueue* blocks) {

    blocks          = remove_empty_blocks(options, blocks);
    blocks          = merge_blocks(options, blocks);
    blocks          = add_code_tags(options, blocks);
    return blocks;
}

static
char* extract(Block* b) {
    return  b->kind == Code         ? b->Code.code          :
            b->kind == Narrative    ? b->Narrative.narrative:
                                      g_assert_no_match;
}

There must be a higher level way to write this utility function …

static
bool is_str_all_spaces(const char* str) {
    g_assert(str);
    while(*str != '\0') {
        if(!g_ascii_isspace(*str))
            return false;
        str++;
    }
    return true;
}
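
One possibly higher level way to write this check is with the standard strspn, which measures the leading run of characters drawn from a given set. The sketch below uses a hypothetical name, all_spaces, and lists the six ASCII whitespace characters explicitly, on the assumption that this is the set the program cares about.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True when str contains only ASCII whitespace (or is empty). */
static bool all_spaces(const char *str) {
    assert(str);
    /* strspn returns the length of the leading run made only of the listed
       characters; if that run reaches the terminator, nothing else is there. */
    return str[strspn(str, " \t\n\v\f\r")] == '\0';
}
```

This trades the explicit loop for a library call, with the character set spelled out rather than delegated to a classification function.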

static
GQueue* remove_empty_blocks(G_GNUC_UNUSED Options* options, GQueue* blocks) {

    g_queue_foreach(blocks, g_func(Block*, b,
        if(is_str_all_spaces(extract(b)))
            g_queue_remove(blocks, b);
                                   ), NULL);
    return blocks;
}

static
GQueue* merge_blocks(G_GNUC_UNUSED Options*options, GQueue* blocks) {
    return  g_queue_is_empty(blocks)            ? blocks            :
            g_queue_get_length(blocks) == 1     ? blocks            :
                ({
                 Block* h1 = g_queue_pop_head(blocks);
                 Block* h2 = g_queue_pop_head(blocks);
                 h1->kind == Code && h2->kind == Code ? ({
                     char* newCode =
                        g_strjoin("", h1->Code.code, NL, h2->Code.code, NULL);
                     Block* b = union_new(Block, Code, .code = newCode);
                     merge_blocks(options, g_queue_push_front(blocks, b));
                                                         })         :
                 h1->kind == Narrative && h2->kind == Narrative ? ({
                     char* newNarr =
                        g_strjoin(
                            "", h1->Narrative.narrative, NL, h2->Narrative.narrative, NULL);
                     Block* b = union_new(Block, Narrative, .narrative = newNarr);
                     merge_blocks(options, g_queue_push_front(blocks, b));
                                                         })         :
                                                         ({
                     GQueue* newBlocks =
                        merge_blocks(options, g_queue_push_front(blocks, h2));
                     g_queue_push_front(newBlocks, h1);
                                                         });
                 });
}

This really should be in glib …

inline static
gint g_asprintf_z(gchar** string, gchar const *format, ...) {
	va_list argp;
	va_start(argp, format);
	gint bytes = g_vasprintf(string, format, argp);
	va_end(argp);
    return bytes;
}

static
char* indent(int n, char* s) {
    g_assert(s);

    char* ind       = g_strnfill(n, ' ');
    char* tmp;
    g_asprintf(&tmp, "%s%s", ind, s);

    char* withNl;
    g_asprintf(&withNl, "\n%s", ind);

    return g_strjoinv(withNl, g_strsplit(tmp, NL, -1));
}
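
For comparison, the same indentation can be sketched in plain C with a single allocation. The hypothetical indent_lines below is meant to mirror the split-and-join behaviour of the function above (spaces are also emitted after a trailing newline), but it is a sketch and not the code the program uses; the caller frees the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of s with n spaces inserted at the start
   of every line. The caller is responsible for freeing the result. */
static char *indent_lines(int n, const char *s) {
    assert(s);
    assert(n >= 0);

    /* Count lines so the output buffer can be sized exactly. */
    size_t lines = 1;
    for (const char *p = s; *p; ++p) lines += (*p == '\n');

    char *out = malloc(strlen(s) + lines * (size_t)n + 1);
    assert(out);

    char *w = out;
    memset(w, ' ', (size_t)n);          /* indent the first line */
    w += n;
    for (const char *p = s; *p; ++p) {
        *w++ = *p;
        if (*p == '\n') {               /* indent whatever follows a newline */
            memset(w, ' ', (size_t)n);
            w += n;
        }
    }
    *w = '\0';
    return out;
}
```

Counting the newlines first avoids the intermediate strings that the split/join approach creates, which matters less when an arena or the operating system cleans up anyway.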

And finally I ended up defining map. See if you like how the usage looks in the function below.

#define g_queue_map_z(q, type, name, ...) ({                                \
        GQueue* private_res = g_queue_new();                                \
        g_queue_foreach(q, g_func(type, name,                               \
            name = __VA_ARGS__;                                             \
            g_queue_push_tail(private_res, name);                           \
            ), NULL);                                                       \
        private_res;                                                        \
                                      })

static
GQueue* add_code_tags(Options* options, GQueue* blocks) {

    GQueue* indent_blocks(GQueue* blocks) {
        return g_queue_map(blocks, Block*, b,
                b->kind == Narrative ? b                                                                                                    :
                b->kind == Code      ?
                    union_new(Block, Code, .code =
                        indent(options->code_symbols->Indented.indentation, b->Code.code))    :
                    g_assert_no_match;);
    }

    GQueue* surround_blocks(GQueue* blocks) {
        return g_queue_map(blocks, Block*, b,
                b->kind == Narrative ?
                    union_new(Block, Narrative, .narrative =
                        g_strjoin("", NL, g_strstrip(b->Narrative.narrative), NL, NULL))   :
                b->kind == Code      ?
                    union_new(Block, Code, .code = g_strjoin("",
                                                 NL,
                                                 options->code_symbols->Surrounded.start_code,
                                                 NL,
                                                 g_strstrip(b->Code.code),
                                                 NL,
                                                 options->code_symbols->Surrounded.end_code,
                                                 NL,
                                                 NULL))    :
                                       g_assert_no_match;);

    }

    return  options->code_symbols->kind == Indented     ?   indent_blocks(blocks)   :
            options->code_symbols->kind == Surrounded   ?   surround_blocks(blocks) :
                                                            g_assert_no_match;
}

char* stringify(GQueue* blocks) {
    GString* res = g_string_sized_new(2048);
    g_queue_foreach(blocks, g_func(Block*, b,
        g_string_append(res, extract(b));
    ), NULL);
    return g_strchug(res->str);
}

void deb(GQueue* q);

static
char* translate(Options* options, char* source) {
    g_assert(options);
    g_assert(source);

    GQueue* blocks  = blockize(options, source);
    blocks          = process_phases(options, blocks);
    return stringify(blocks);
}

Parsing the command line

In glib there is a command line parser that accepts options in unix-like format and automatically produces professional --help messages and such. We should really have something like this in .NET. Perhaps we do and I’m not aware of it?

typedef struct CmdOptions { char* input_file; char* output_file; Options* options;} CmdOptions;

static
CmdOptions* parse_command_line(int argc, char* argv[]);

static char *no = NULL, *nc = NULL, *l = NULL, *co = NULL, *cc = NULL, *ou = NULL;
static char** in_file;

static int ind = 0;
static bool tests = false;

// this is a bug in gcc, fixed in 2.7.0 not to moan about the final NULL
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wmissing-field-initializers"

static GOptionEntry entries[] =
{
  { "language"          , 'l', 0, G_OPTION_ARG_STRING, &l ,
                                "Language used", "L"  },
  { "output"            , 'o', 0, G_OPTION_ARG_FILENAME, &ou,
                                "Defaults to the input file name with mkd extension", "FILE" },
  { "narrative-open"    , 'p', 0, G_OPTION_ARG_STRING, &no,
                                "String opening a narrative comment",   "NO" },
  { "narrative-close"   , 'c', 0, G_OPTION_ARG_STRING, &nc,
                                "String closing a narrative comment",   "NC" },
  { "code-open"         , 'P', 0, G_OPTION_ARG_STRING, &co,
                                "String opening a code block",          "CO" },
  { "code-close"        , 'C', 0, G_OPTION_ARG_STRING, &cc,
                                "String closing a code block",          "CC" },
  { "indent"            , 'i', 0, G_OPTION_ARG_INT,    &ind,
                                "Indent the code by N whitespaces",    "N"  },
  { "run-tests"         , 't', G_OPTION_FLAG_HIDDEN, G_OPTION_ARG_NONE,   &tests,
                                "Run all the testcases", NULL },
  { G_OPTION_REMAINING  ,   0, 0, G_OPTION_ARG_FILENAME_ARRAY, &in_file,
                                "Input file to process",   "FILE" },
  { NULL }
};
#pragma GCC diagnostic pop

Brain damaged way to run tests with a -t hidden option. Not paying the code size price in release.

#ifndef NDEBUG
#include "tests.c"
#endif

Here is my big ass command parsing function. It could use a bit of refactoring …

void destroy_arena_allocator();

static
CmdOptions* parse_command_line(int argc, char* argv[]) {

    GError *error = NULL;
    GOptionContext *context;

    context =
        g_option_context_new ("- translate source code with comments to an annotated file");
    g_option_context_add_main_entries (context, entries, NULL);
    g_option_context_set_summary(context, summary(s_lang_params_table));

    if (!g_option_context_parse (context, &argc, &argv, &error))
        report_error("option parsing failed: %s", error->message);

    CmdOptions* opt = g_new(CmdOptions, 1);
    opt->options = g_new(Options, 1);

    #ifndef NDEBUG
    if(tests) {
        int i = run_tests(argc, argv);
        exit(i);
    }
    #endif

    if(!in_file) report_error("No input file");
    opt->input_file = *in_file;

    // Uses input file without extension, adding extension .mkd (assume markdown)
    opt->output_file = ou ? ou :  ({
                                  char* output      = g_strdup(*in_file);
                                  char* extension   = g_strrstr(output, ".");
                                  extension ? ({
                                               *extension = '\0';
                                               g_strjoin("", output, ".mkd", NULL);
                                                }) :
                                               g_strjoin("", output, ".mkd", NULL);
                                  });

    if(l) { // user passed a language
        LangSymbols* lang = lang_find_symbols(s_lang_params_table, l);
        if(!lang) report_error("%s is not a supported language", l);

        opt->options->start_narrative  = lang->start;
        opt->options->end_narrative    = lang->end;

    } else {
        if(!no || !nc) report_error("You need to specify either -l, or both -p and -c");

        opt->options->start_narrative  = no;
        opt->options->end_narrative    = nc;
    }

    if(ind) { // user passed an indentation
        opt->options->code_symbols = union_new(CodeSymbols, Indented, .indentation = ind);
    } else {
        if(!co || !cc) report_error("You need to specify either -indent, or both -P and -C");
        opt->options->code_symbols =
            union_new(CodeSymbols, Surrounded, .start_code = co, .end_code = cc);
    }

    return opt;
}

Some Windows programs (e.g. Notepad, VS, …) add a 3-byte prelude (the byte order mark) to their UTF-8 files. C doesn’t know anything about it, so you need to strip it. On this topic, I suspect the program works on UTF-8 files that contain non-ASCII chars, even if when I wrote it I didn’t know anything about localization.

It should work because I’m just splitting the file when I see a certain ASCII string, and in UTF-8 the bytes of ASCII chars cannot appear inside the encoding of any other character.

char* skip_utf8_bom(char* str) {
    unsigned char* b = (unsigned char*) str;
    return  b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF    ? (char*) &b[3]  : // UTF-8
                                                              (char*) b;
}
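
The BOM check can also be phrased with strncmp, which stops at the terminator and so stays safe on strings shorter than three bytes. The name skip_bom is hypothetical; this is a sketch of an equivalent formulation, not a replacement the program depends on.

```c
#include <string.h>

/* Skips a leading UTF-8 byte order mark (bytes EF BB BF) if one is present.
   strncmp never reads past the terminator of a shorter string. */
static const char *skip_bom(const char *str) {
    return strncmp(str, "\xEF\xBB\xBF", 3) == 0 ? str + 3 : str;
}
```

When writing test strings, keep the BOM in its own literal ("\xEF\xBB\xBF" "abc"), because a hex escape greedily consumes any hex digits that follow it.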

Not freeing memory (again)

The reason I haven’t been freeing memory all along is that I was planning on using an arena allocator (a kind of linear allocator).

Memory management is fully orthogonal to the style of programming described in this post. You can do it whatever way you prefer, but there is a certain affinity between an arena allocator (or garbage collection) and functional programming because of the temporary objects created in expressions. You could create the temporary objects explicitly, but that would diminish the conciseness of the paradigm.

I have an arena allocator implementation here. In the code below I compile it out (behind #ifdef ARENA) so that you don’t have a dependency on that code if you want to try this. The program runs so quickly and does so little that you can probably let the operating system reclaim memory at the end of the process life.

If you ended up integrating this with an editor (e.g. for literate programming editing), you’d need to be more careful.

#ifdef ARENA

Arena_T the_arena;

inline static
gpointer arena_malloc(gsize n_bytes) {
    return Arena_alloc(the_arena, n_bytes, __FILE__, __LINE__);
}

inline static
gpointer arena_calloc(gsize n_blocks, gsize n_block_bytes) {
    return Arena_calloc(the_arena, n_blocks, n_block_bytes, __FILE__, __LINE__);
}

inline static
gpointer arena_realloc(gpointer mem, gsize n_bytes) {
    return Arena_realloc(the_arena, mem, n_bytes, __FILE__, __LINE__);
}

void arena_free(G_GNUC_UNUSED gpointer mem) {
    // NOP
}

void set_arena_allocator() {
    GMemVTable vt = (GMemVTable) { .malloc = arena_malloc,      .calloc = arena_calloc,
                                   .realloc = arena_realloc,    .free = arena_free,
                                   .try_malloc = arena_malloc,  .try_realloc = arena_realloc};
    g_mem_set_vtable(&vt);

    the_arena = Arena_new();
}

void destroy_arena_allocator() {
    Arena_dispose(&the_arena);
}

#endif
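
Since arena.h is not shown, here is a minimal sketch of what a linear (bump) arena can look like, under the assumption of one fixed-size backing buffer and a C11 compiler. The names Arena, arena_new, arena_alloc and arena_dispose are hypothetical and unrelated to the Arena_* functions used above. As far as I know, recent GLib releases (2.46 onward) deprecate g_mem_set_vtable, so hooking an arena into GLib this way may no longer have any effect there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Arena { char *base; size_t used; size_t cap; } Arena;

/* Create an arena backed by one malloc'd buffer of cap bytes. */
static Arena arena_new(size_t cap) {
    Arena a = { malloc(cap), 0, cap };
    assert(a.base);
    return a;
}

/* Bump allocation: round the offset up to the maximum alignment and hand
   out the next n bytes. Returns NULL when the arena is exhausted. There is
   no per-object free. */
static void *arena_alloc(Arena *a, size_t n) {
    size_t align = _Alignof(max_align_t);
    size_t start = (a->used + align - 1) / align * align;
    if (start > a->cap || n > a->cap - start) return NULL;
    a->used = start + n;
    return a->base + start;
}

/* Release everything at once. */
static void arena_dispose(Arena *a) {
    free(a->base);
    a->base = NULL;
    a->used = a->cap = 0;
}
```

Temporary objects created inside expressions simply accumulate in the buffer and disappear together at dispose time, which is the affinity with the functional style described above.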

Summary

I have to say, it didn’t feel too cumbersome to structure C code in a functional way, assuming that you can use GLib and a couple of GCC extensions to the language. It certainly doesn’t have the problems that C++ has in terms of debugging STL failures.

There are a couple of things I don’t like about GLib and I’m working on a hobby project to overcome them. Eventually I’ll post it.

int main(int argc, char* argv[])
{
#ifdef ARENA
    set_arena_allocator();
#endif

    CmdOptions* opt = parse_command_line(argc, argv);

    char* source    = NULL;
    GError* error   = NULL;

    if(!g_file_get_contents(opt->input_file, &source, NULL, &error))
        report_error("%s", error->message);

    source = skip_utf8_bom(source);

    char* text              = translate(opt->options, source);

    if(!g_file_set_contents(opt->output_file, text, -1, &error))
        report_error("%s", error->message);

#ifdef ARENA
    destroy_arena_allocator();
#endif

    return 0;
}

Comments

“We should really have something like this in .NET. Perhaps we do and I’m not aware of it?”
http://www.ndesk.org/doc/nd…
Simple source level single file dependency. Works a treat.

Pretty cool. I knew someone would have done it.

Alois Kraus

2013-03-21T22:33:29Z

A C# console application template with a more functional command line parser is also on Code Gallery: http://visualstudiogallery….

thanks Alois.

Terrance Smith

2013-03-22T22:15:54Z

Dude, I am way impressed. F# and C are both languages I slowly am attempting to achieve some sort of mastery over so already relevant to my interests.
But then you did a F#->glib & C parser.
You just won the game.
I am curious about your other projects as well including your C lib https://github.com/lucabol/…. Are you going to add licensing to your stuff or just have it out there?

LLib has the most liberal license I could find (BSD) and it just uses BSD components. All the rest is just out there. Use whatever you need.

Hi,
I would like to have the source of the above implementation, but the link is missing. Searched your github, too. Did I miss it?
Thanks.

John Martin

2014-08-03T01:52:15Z

I have a problem that is driving me crazy and I’m not sure how to post a question. It is regarding passing a pointer (int *Pointer;) to a function, having that function assign the pointer with malloc, then, when the function ends and I’m back in main, it destroys the pointer.

This looks like something to post on stackoverflow with a suitable repro case. Cheers.