A modest proposal for the ChoiceScript language

Obviously, this thread title is tongue-in-cheek. But there are certain features I’d like to see that are lacking compared to other languages, things which would help prevent bugs and improve the authoring experience.

First, the goal of any ChoiceScript changes would be to make the language easier to use, and to encourage good practices that will keep code more consistent and easier to maintain and help prevent bugs. Most importantly, I’d say, any changes like this need to not break any existing code.

So, here are three issues, or three parts of one big issue:

1. There are no constant values.

A constant is like a variable that never changes. Constants are great to use in a narrative game context to set, for example, target values that skills and stats will be compared against to determine the results of #options. That way, you can test the gameplay and see whether these values are too easy or too hard, and they can be adjusted immediately, all in one place, at the top of the scene file.

For instance, I’ve started defining and using values as if they were constants in my current project:

  *comment integer-stat-testing target values
  *temp TEST_EASY 2
  *temp TEST_MED 3
  *temp TEST_TOUGH 4

  *comment secondary value testing points
  *temp THRESHOLD_MINOR 10
  *temp THRESHOLD_MAJOR 20

(By convention, constants are typically named in ALLCAPS, but this isn’t strictly necessary.)

Then, whenever I test a skill, I can test *if skill > TEST_EASY, or *if secondary_value > THRESHOLD_MINOR, and so on.

This provides a compact, easy place to keep all these important numbers, to ensure they’re all consistent. If I want to adjust one for balancing purposes, I know I can’t accidentally overlook a single instance hidden somewhere.
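
As a minimal sketch of how such a check might read inside a scene (the lockpicking stat, the option text, and the labels are hypothetical, invented purely for illustration):

  *temp TEST_EASY 2
  *temp lockpicking 3

  *choice
    #Pick the lock.
      *comment The target lives in one named place instead of a bare 2
      *if lockpicking > TEST_EASY
        The lock clicks open.
        *goto inside
      *else
        The pick snaps in your fingers.
        *goto locked_out
    #Walk away.
      *goto locked_out

  *label inside
  You slip through the door.
  *finish

  *label locked_out
  The door stays shut.
  *finish

Rebalancing the scene then means editing the single *temp line rather than hunting for every comparison.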

Defining these as *temps also lets me adjust difficulty every scene; later scenes should have more difficult skill tests than earlier ones, for instance.

But since these are variables, not constants, this is a bit dangerous. For instance, I might accidentally

  *set TEST_EASY successes

when I meant to do the reverse, and then not realize my mistake until I noticed that my skill checks were easier or harder than they were supposed to be in this chapter. Depending on how diligent I am about testing manually or how closely I read randomtest transcripts, that bug might take months to recognize! Whereas if TEST_EASY were a true constant, that line would immediately throw an error.

2. There’s no dedicated enumerated types system.

This is closely related to the last issue; enumerated values in other languages are typically constants, and I feel they should be here, as well. Multireplace and arrays may have made this concept a bit less necessary than it had been a few years ago, but it’s still not really the ideal situation.

Basically, the idea with enumerations, aka “enums”, is that there is a set of related things, and one or more variables whose values are limited to members of that set.

Gender is the classic example. Say the player (or another important character) can be one of female, male, or nonbinary.

If you implement those three as strings, that’s dangerous; what if you misspell one of the words? CS has no way of knowing what a valid “gender” string is supposed to be. It doesn’t even know the concept exists. If you’re not careful, the game will just fail silently, not displaying any word when it always should display one of them. The only way you’ll even get an error message is if you don’t use an *else in your gender-related code.

Alternately, you could make it a number. You could even define those various gender (or whatever) possibilities as unique numbers in startup.txt:

  *create FEMALE 1
  *create MALE 2
  *create NONBINARY 3

etc.

Then you can just say *set sam_sex MALE wherever relevant in your scene, which makes your code easy to read. You can also use multireplace on any enum variable:

  She rolls her eyes. "@{sam_sex Girls|Guys|People} like Sam over there… 
  Well, they're hard to understand."

One problem, though, is these are treated as simple numbers internally by CS; the language doesn’t know that 1 and 2 and 3 are valid sexes, but 0 and -2 and 17 and 0.625 are not. If you somehow accidentally set someone’s sex to an invalid number, the only way you’ll realize it is later on, some lines or paragraphs or even whole scenes later, because an error message will only pop up whenever you next try to test the value. This makes tracking down where the problem actually occurred incredibly difficult.

And, because they’re all numbers, you can do weird stuff like *set FEMALE ((MALE * MALE) - FEMALE). You shouldn’t do that, obviously, but it is valid code, and it will run.

If enums instead got their own dedicated type system, you couldn’t set a “gender” variable to anything but FEMALE or MALE or NONBINARY. Any attempt to perform mathematical operations on it would throw an error message. And any attempt to set the variable to anything else, “male” or “MALE”, 0 or -5, or even 1 or 2, would as well. Better to fail immediately with a clear explanation than to misbehave, or fail silently, or throw an error message ten minutes later pointing to the wrong part of your code.
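
Without a dedicated type, the closest approximation is a hand-written check. A minimal sketch, assuming the FEMALE/MALE/NONBINARY values defined above and a hypothetical check_sex subroutine that the author has to remember to call after every change:

  *label check_sex
  *comment Rejects values outside 1-3, such as 0, -2, or 17
  *if (sam_sex < FEMALE) or (sam_sex > NONBINARY)
    *bug sam_sex is out of range
  *comment Rejects fractional values such as 0.625
  *if (sam_sex modulo 1) != 0
    *bug sam_sex is not a whole number
  *return

It would be called with *gosub check_sex right after each *set sam_sex. It only helps where the call is actually made, and it does nothing to stop FEMALE itself from being overwritten.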

I do use fake enums like this in my own latest project. The best way to demonstrate the concept might be in an example:

startup.txt
	*comment Sexes
	*create FEMALE 1
	*create MALE 2
	*create THEMS 3
	*create PLURAL 4
	*create OBJECT 5

	*comment Pre-populate the pronouns
	*create they_1 "she"
	*create they_2 "he"
	*create they_3 "they"
	*create they_4 "they"
	*create they_5 "it"

	*create them_1 "her"
	*create them_2 "him"
	*create them_3 "them"
	*create them_4 "them"
	*create them_5 "it"

…etc., for all the pronouns.

Defining these pseudo-enums in startup lets me use them dynamically in paragraphs, thanks to multireplace and arrays. Then, in a scene.txt file:

  *temp sam_sex MALE
  *if pc_sex = MALE
    *set sam_sex FEMALE

  Sam walks up. You shoot ${them[sam_sex]} a meaningful look.
  "Well," ${they[sam_sex]} says. "Perhaps."

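This will automatically display as

  Sam walks up. You shoot him a meaningful look.
  "Well," he says. "Perhaps."

if Sam is male, and

  Sam walks up. You shoot her a meaningful look.
  "Well," she says. "Perhaps."

if female.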

3. There’s not enough type safety. (Or, maybe not enough types.)

Not quite as major as the other two, but it’s all more or less interrelated. The way ChoiceScript checks values means a lot of user errors can slip through, becoming very difficult to track down.

For example: ChoiceScript lets you add a fairmath number to a fairmath number, even if that would bring its value above 100; or subtract the one from the other, even if that would bring its value below 0; or perform any other operation that would bring its value out of the 0-100 bound. This is almost always a bug, but CS lets you do it, because it doesn’t recognize that a number for use in fairmath is supposed to differ in any way from a “money” or “number of troops” or “acres of land owned” sort of number.

Again, the only way you’ll realize you’ve accidentally set a fairmath number too high or low is later on, possibly even in a different scene than where you set it, whenever you use that variable in another fairmath calculation. The culprit is usually some + or - that was accidentally written when the author meant to write %+ or %-. If there were a fairmath-number type, CS could throw an error message immediately whenever its value was set outside the valid fairmath range.
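
A minimal sketch of that slip, using a hypothetical charm stat. The arithmetic in the first comment assumes the usual fairmath rule that an increase of N closes N percent of the remaining gap to 100:

  *temp charm 90
  *comment Fairmath: 90 + (100 - 90) * 20 / 100 = 92
  *set charm %+ 20
  *comment Slip: plain addition, so charm is now 112 and nothing complains yet
  *set charm +20
  *comment Only this later fairmath operation, possibly scenes away, reports the problem
  *set charm %+ 10

The error, when it finally appears, points at the last line rather than at the plain +20 that caused it.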

If the author could explicitly define variable types from the outset—or maybe there was a use_strict_typing flag, akin to implicit_control_flow, which required the author to explicitly declare the type for every *create and *temp variable—this would prevent such hard-to-track-down bugs.

Even if such a flag wasn’t used, CS could dynamically determine variable types, as it already does right now. Number, string, and boolean types would be determined as they currently are; enums would be recognized when they’re set or initialized to an enumerated value like MALE or FEMALE or BLOND or BLUE; fairmath values whenever a *set A %+ B or *set A %- B was done to them.

Thoughts?

I have some ideas about potential syntax for these proposed features, as well, but this post is long enough already. First, I’d like to hear what you think about them.

And, as I mentioned before, I want to emphasize that none of these changes should break your code. Well, OK, maybe if you’re doing some kind of complicated math on a fairmath value, without using some intermediate *temp variables, while still managing to constrain the result within 0-100 at the very end. Or, if you’re assigning numbers to string variables and vice versa. But, uh, don’t do that.

Quick response since I’m on mobile (might do detailed one later or not):

I disagree with 1 and 2, but I can see point 3 to be applied.

Having dabbled with Octave for some weeks, I can see the utility of the global command in that program. However, the “problem” solved by that utility can be avoided by simply leaving the variables alone; what you suggested would basically be “make *else bugcatcher-checks built-in and mandatory for particular variables.”

I like the idea of giving coders freedom to do their things. Your points 1 & 2, I feel, seem to add a certain level of complexity to CS itself, as well as adding (hypothetically) complicated material to the documentation manual.

a modest proposal

I also think ChoiceScript should devour its young! :wink:

Quick reply from phone. I see the point of 1, 3 (despite your explanations at narrascope I still can’t quite understand 2).

But… I think these should be optional? Kind of like what Fortran 95 does with Fortran 77. Fortran 95 lets you define variables… Or not. I suspect newcomers to programming might be happier not defining variables… While those with experience understand its value?

I’ll try to think more about 2 tomorrow… Though if I don’t get it straight away maybe other newcomers to choicescript might also struggle?

All of these problems are easy to avoid. Like, there is absolutely a way to know if you’ve set a gender variable wrong. You run the game, and if you see a spot where there should be gendered text and you see that it’s displaying incorrectly, go to that spot and fix it.

Also, just because you CAN do math with a gender variable that uses numbers to determine gender doesn’t mean you WILL. You have to pretty consciously choose to run math code with the gender variable for that to happen.

If you have a clear idea of how to execute the code, it’s incredibly easy to execute said code without creating errors for yourself.

Yes, all of these things would be optional, the same as variable use in general.

I don’t see what’s complex about #1?

  Constants are like variables, except they never change.
  Use them to give names to certain numbers, such as testing values for each scene.

  >Example

It should also be trivially easy to add to the language, so that shouldn’t add any complexity to CS. Just don’t allow *set commands on constants.

Enums can be a bit trickier to set up, but I think they make sense semantically like any other variable types. For example, some things can be any arbitrary whole number, like counts; some things can be any arbitrary decimal number, like average ratings; some things can be any arbitrary word or phrase, like names; and then some things are limited to only one item from a predefined set. If you’re buying a train ticket in San Diego, for example, there’s only a few possible destination cities. You can’t buy a train ticket there to Paris or Montevideo because they’re not in that set.

You can already use variables as a constant. Just never use *set to change a variable and it won’t change. The example you provided is already possible through Choicescript. Create a variable for a testing value, never use *set to change that value, and then compare a stat against that variable for checks.

Enums seem redundant when you can just write a *comment line explaining what the numbers of a variable are when you make the variable in startup.txt.

*comment 1 = machete, 2 = baseball bat, 3 = switchblade, 4 = pistol
*create weapon 0
*create pistol_ammo 0

*comment 1 = stretching cord, 2 = almanac, 3 = running shoes, 4 = fidget cube
*create valuable 0

Sure, if you simply always write correct code, your code will always be correct. But humans aren’t perfect or infallible. These are purely preventive measures an author can take to help prevent any difficult-to-track-down bugs in the future.

Right, like I mentioned, I’m doing all of these things already in my project. But CS doesn’t know anything about those systems, and it doesn’t know what the valid inputs are, which is what would allow it to immediately identify typos, or accidental *sets applied to constants, or other mistakes. It’s all on the author to detect any problems using these structures; the language does nothing to help at all.

Do you use CSIDE? Just so I can understand the context of what you’re working with right now.

To be honest, I’m not sure what these proposals would add to the language except more complexity. ChoiceScript is designed to be easy to learn, and one of the things that makes it easy to learn is the fact there aren’t that many things to learn.

While I’m not against more complexity per se, I think any increase in complexity should come with a greater or equal increase in functionality. I don’t see how these changes would increase functionality much. You can already create pseudo-constants and pseudo-enums, and I feel anyone knowledgeable enough to do that is capable enough to write code that minimizes the risks of ChoiceScript’s lack of bounds checking and type safety.

Concerned about using regular arithmetic instead of fairmath? Write a subroutine to handle incrementing/decrementing your fairmath variables. Concerned about variables being out of range? Use a subroutine that checks their values. Between the automated tests, subroutines, *bug, basic proofreading, and on-forum playtesting, most bugs are pretty easy to spot.
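
A minimal sketch of such a subroutine, assuming a hypothetical fair_add label and relying on *params and curly-brace references. It only protects stats that are always changed through it:

  *comment Call as: *gosub fair_add "charm" 20
  *label fair_add
  *params stat_name amount
  *comment {stat_name} reads or writes the variable whose name is stored in stat_name
  *if ({stat_name} < 0) or ({stat_name} > 100)
    *bug A fairmath stat is outside 0-100
  *set {stat_name} %+ amount
  *return

A matching subroutine for decreases would use %- in the same way.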

Not saying this to be condescending, but are these bugs common? The issue I encounter the most is accidentally checking boolean variables for true instead of false or vice-versa, and I usually notice that when the wrong text displays after a certain choice. I’ve never even heard of the errors you’ve brought up (which doesn’t mean they don’t happen - I’m just skeptical that they occur frequently enough to necessitate a change to the programming language).

Mirroring LeoXII’s comment, do you use CSIDE or Chronicler? Those programs drastically reduce these types of mistakes with contextual highlighting. I respect that you want to improve the coding language but the issues you brought up can be mitigated significantly with careful planning and the use of either CSIDE or Chronicler. Hell, I have exactly 10 minutes of coding experience and my coding bugs are usually easy to spot in this language.

Yes, I use Notepad++ modified for ChoiceScript-specific syntax highlighting.

It still puts all the onus on the author to keep track of all these issues and track down any problems that arise, rather than allowing the language itself to take on the brunt of ensuring correctness. The act of planning and writing an interactive story is difficult enough on its own; I feel the system should be able to do most of the heavy lifting behind the scenes, to support the author.

Right, that’s exactly my motivation for these additions. They are purely quality of life improvements for authors.

The solutions you propose require the author to first have enough knowledge or experience to realize these are potential pitfalls in the first place; then to be a good enough coder to write such functions correctly; and even then, they still have to remember to call them after basically every block of code, if not more than once in some blocks, to check every global variable and every scene-specific variable. That means they’ll need a global function as well as a scene-specific function that it calls. To prevent code duplication they’d want to call a subroutine for each variable they test, which incurs more overhead and another slight delay. And they’d need to remember to keep both the global and the local functions up to date every time they add a global or scene variable; otherwise they would have an even harder-to-track-down error, because this code is supposed to be protecting them. In a big game, that could mean a lot of code to chug through, and potential performance issues on some of the phone platforms CS is supposed to run on. If bounds-checking were built in instead, it would only need to test a single variable per line: whichever one was just *set.

Whereas if these things were built into the language instead of being implemented ad-hoc, an author could just use them from the start as explained in the instruction manual, and they would all just work.

Isn’t the purpose of Quicktest and Randomtest to allow for ease of tracking down issues to do with variable and set problems and ensure correctness? Or are you arguing that there should be something that checks those things without having to use the automated tests and manually fix them itself? If that’s the purpose, I’m not so sure. It seems like it would be rather difficult to develop an AI that can fix a code for you, a lot of the time because programs aren’t really able to understand an author’s intent: Robot doesn’t understand context or the method you’re using, especially considering a lot of people can use the code in different methods.

I can understand it’s a daunting task to keep track of different variables and types while writing and inputting them in multiple places, but to me that is where proofreading tends to help the process along. Sharing your work with others can certainly make a heavy task like that a much lighter burden.

It seems to me like the majority of your proposals would only be used by those who have a deeper understanding of the coding language. By that point, wouldn’t it be redundant to implement these when there are already simplistic fixes and ways to script for those that know how to or take the time to figure it out? I might be wrong in this regard, since I’m not the best person to ask for in-depth coding difficulties, but even without experience I don’t think I’ve been substantially set-back in writing because I was grueling over the code (granted my game isn’t suuper code-heavy, but all things considering still a lot of branching and variable checks).

I’m going to echo @trevers17 in asking how frequently you’re encountering these bugs?

My feeling was that by the time someone’s using constants and enums they’re advanced enough to write a subroutine that checks if a variable is between two numbers.

In regards to using subroutines to debug, I was imagining someone would use them every chapter or so, or when they want to double check everything is set correctly. And I don’t know why someone would leave them in the final project.

Overall, I’m just skeptical that these changes would catch bugs not immediately caught by:

  • Rereading your code
  • Frequent playtesting
  • Strategically placed *bug statements
  • Quick Test and Random Test

Now I’m on pc, I can elaborate more on my previous comment.


Let’s say the hypothetical command is *constant (and *constant_temp for temp variation).

If a constant is by definition a “locked variable” (well, it’s a constant), I won’t be able to do this

You gain the blessing of the divines. Your tasks should be easier ahead.
*set constant -bonus

and would have to do this instead every time

*if skill > (constant -bonus)

This is what I’d call adding complexity (and limiting design space, to an extent). Sure, there’re workarounds, but once an author becomes skilled in CS, I think they’ll opt to leave *constant and prefer the simpler *create (compare: *line_break vs. [n/]).

In Octave, the global command (the equivalent of *constant) is useful for storing universal values such as the gravitational constant or a soil’s angle of friction. This global can be modified inside a function (the CS equivalent being a subroutine), but only within the scope of that function. The constant is “returned” to its global value once the program leaves the function.


As for enums, I’d rather have either: 1) an actual, functional *array command (throwing out another hypothetical term). Var[index] exists, but you have to *create the “child variables” first before you can use them through var[index], which is prone to mistakes; or 2) additional datatypes for floating-point numbers and non-negative integers. This could be something like *positive, which throws an error when the value goes below 0, or *decimal, which stores X digits after the decimal point (X being further modifiable?).

I do agree with your point 3, though. Fairmath is niche enough that it can use clearer separation from normal numbers (fairmath datatype). Heck, do *fairmath instead and forgo all that %+ %-.


Closure:
Despite being “readable by non-coders,” CS is a programming language. Adding limitations to it would make it even more niche among other languages, and I feel CS already leaves out (hasn’t implemented?) many elements that would otherwise add more freedom: in-line images or icons, tables (or just columns, for layout’s sake), checkboxes and drop-down lists (generally more GUI; we only have radio buttons for now), etc.

Granted, this would add complexity, but the utility of the language would dramatically increase too.

As it stands now, CS is perfectly fine for writers who aren’t too concerned with gameplay elements (you really won’t do *set FEMALE (MALE * MALE) - FEMALE anyway), but those who want to dabble more in gameplay elements will have to learn more about CS. *constant and the enums won’t add anything for these people.

Harmless commentary

I like your implementation

*create male 1
*create female 2
etc.

Looks cleaner and easier on the eyes. I always used *create gender 0 and set it to 1/2/3: male/female/nb. Very tedious when I have to write @{gender him|her|them} everywhere.

If these are made optional, then I’m all for it. Basically, whoever wants to continue using Choicescript as it is can do so. Then, those of us who are making code-heavy games can implement them. I know for sure that options 1 and 3 would greatly help me. My latest game had 330,000 words, maybe around 25% of it code. I’ve seen Chris’ code (he is also very code heavy, and a more elegant and better programmer than me), and I can see why this would help him, and many others.

Obviously, somebody who is super-careful would not need 1 and 3… but most of us make mistakes at some point, and these can be difficult to find in long games (even with many proofreads, and people playing and replaying, I still had many fairmath bugs that I couldn’t locate). So, I would definitely be using 1 and 3… (and the hangover today isn’t helping…)

I feel like constants and enums should be basic components of a language, not esoteric wizardry used by only a select few, because they make code easier to read and much easier to maintain and debug. They’re things everyone should use from the beginning.

Besides, people do use these patterns all the time, anyway, even though they don’t think of them explicitly in these terms. They write *if skill_a > 15 and *if skill_b > 16 and *if skill_c > 14 and *set char_a_relationship %+ 25, tossing out individual constants all over the place, hundreds or thousands every scene (of which there are probably only one or two dozen unique values), making for a huge mess to adjust if testing suggests any of the gameplay balance needs tweaking.

And they write

*choice
  #Bob is my trusty partner.
    *set partner "Bob"
  #Sam is always by my side.
    *set partner "Sam"
  #J.D. has been my partner through thick and thin.
    *set partner "J.D."

which is not typesafe and hence prone to typo or misremembrance errors. But they use this pseudo-enum pattern anyway because using a half-dozen boolean values instead, only one of which can be true at a time, is much more unwieldy, both to use and to ensure correct. Also because as a string its value can be printed outright; the only other reasonable alternative would be to use the integer-style pseudo-enum and a multireplace to print the name.

So, people are already using these patterns, because they make things much easier for the author, despite the risk. I’m just proposing to make them safe to use.

If you only checked every chapter, that wouldn’t catch any *temp level errors. Global pernicious bugs would be prevented, but not local ones.

I guess you could comment out all these tests while you’re not testing; you wouldn’t want to delete them because, well, you can (or at least should) never stop testing. Even after release, if you want to add a scene or change one thing to fix a bug or correct an oversight, you’d still want to run the test suite to make sure you haven’t inadvertently broken something else.

And besides, that’s still a ton of work you as an author need to plan and build and keep up-to-date to ensure your code doesn’t break, when that weight could be handled more simply and elegantly and efficiently by CS if it knew these patterns existed and what valid values for them were.

No, that’s exactly what I’m intending to use. Maybe I’m not explaining my goal properly. My proposals wouldn’t detect any bugs on their own, without playtesting and Quicktest and Randomtest, because those are precisely the methods which find CS errors. They would only add more error messages in more situations for playtesting and QT/RT to uncover.

Say you’re using a string-based pseudo-enum. A game using it can be in one of three states at any given moment:

  1. Performing correctly, exactly as intended.
  2. Unintentionally in an invalid state, but CS doesn’t know that and doesn’t throw an error. Depending on the future lines of code, it may never throw an explicit error based on this, making for an even more pernicious bug.
  3. Erroneous in such a way that CS (either in playtesting, QT, or RT) throws an error and quits. You misspelled a variable name or something.

My proposals aim to eliminate situation #2 entirely, turning all of those delayed timebomb cases (with potentially indefinite fuses) into #3. Thus, you would immediately hit the problem in testing, with a message pointing to the correct line of code, so you’d know precisely what you need to fix and where.
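
A minimal sketch of situation #2, assuming a partner string set to "J.D." as in the *choice example earlier in the thread (the surrounding prose is invented for illustration):

  *temp partner "J.D."

  You push open the office door.
  *comment Typo: "JD" is missing its periods, so the comparison is simply false
  *if partner = "JD"
    J.D. looks up from the case files.
  *comment No error is raised; the line above is silently skipped
  The room is quiet.

Quicktest and Randomtest both pass this code, because a false comparison is perfectly legal; only a reader who notices the missing sentence will catch it.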

Yes, I ran into this sort of thing several times in my last big (released) project, and several times since then. I know CoG editors like Mary and Jason are very familiar with this issue, because they were talking about it at Narrascope and how hard to track down things like fairmath out-of-bound errors are thanks to the error message being generated long after the actual *set statement that was the culprit.

Yes, that is true; I remember also talking with Mary and Jason about this at Narrascope. So, it must be quite a significant issue… I also encounter this all the time, as I said (and it drives me crazy). So, I would welcome the possibility of having them. But I can also understand that it creates another barrier to entry for those who are not familiar with programming. Hence, I feel it would be better to make it optional… i.e. these types of variables exist, but you are free to continue to use variables the way they are now.

What starts off optional can drift into being mandatory – in the name of coding efficiency, let alone the even more compelling cause of bug-hunting.

I’m still attracted to the idea that the language’s target user group includes authors with very limited programming nous, for whom “Advanced ChoiceScript” genuinely does include things like *line_break, *input_number, and nested conditionals. (That page doesn’t mention any features added in recent years, including multireplace or pseudo-arrays like “${them[sam_sex]}”, even in the for-programmers-only “truly bizarre” section toward the bottom of the page…)

That said, I think Chris’s most recent post does a good job of highlighting the benefits of these changes. As a non-programmer, I’ve no idea of the cost that would be involved in Dan’s time…but I’d support it if the c/b worked out favorably. As long as it remained optional rather than mandatory.

Well, sure, but this is a separate thing. If a value is going to change in the course of a story, it should be a global variable, not a constant. If a value is going to change in the course of a chapter, it should be a *temp, not a *temp_constant.

That’s a kind of weird idea, to be honest, though. Reducing the difficulty of all skill checks, or a certain type of skill check, is kind of the narrative-game equivalent of changing the gravitational constant of the universe or the coefficient of friction on a particular surface. It would probably be better represented by just giving the player a bonus to a certain skill, or class of skills, or all skills (for a global bonus). I know it’s technically not the same thing, but mathematically it works out the same; the difference is really only philosophical.

My feeling has been just the opposite. The more I’ve coded, the more I’ve wanted dedicated constants instead of using

  *if skill > 15
  *if skill > 16
  *if skill > 14
  *set relationship %+ 25
  *set relationship %+ 15

all over the place, and also the more I’ve used enums and wished they were typesafe.

I don’t think of variables as a “simpler” form of constants at all; they’re separate tools for separate needs. The only thing they have in common is the author names each.

Yeah, I looked that up about Octave after you mentioned it. That’s an interesting idea for subroutines, but I’m not sure it would be a good idea to implement that way for ChoiceScript…

I posit that the time for Dan to implement would be easily made up for in the time saved by other team members. The easiest to implement (I think! I’ve never written a full language parser like this) should be constants, then bounded types like fairmath.