The FORTH stuff was meant to show you how good, simple stack-based code works. As for converting it into register-based code...
Well, I see two possible approaches:
1- Each register can hold the last element on its associated stack (or a pointer to that element, of course - C/Pascal issues involved here). So the register-based code could handle as many stacks as there are register-stack pairs available.
Although this approach looks like it improves something, I think it just multiplies the number of stacks the code can handle simultaneously, and there is usually a high price to pay: stack management is only optimised for the stack-dedicated registers (SS, SP, BP...), not for other register types.
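To make approach 1 a bit more concrete, here is a rough Python simulation (not real register code). The class name `TosStack` and its fields are made up for illustration: `tos` stands in for the register paired with the stack, and `memory` for the rest of that stack.

```python
class TosStack:
    """One stack whose top element is kept in a (simulated) register."""

    def __init__(self):
        self.tos = None    # simulated register: holds the top element
        self.memory = []   # the rest of the stack, below the top
        self.depth = 0     # total number of elements on the stack

    def push(self, value):
        if self.depth > 0:
            # The old top leaves the register and goes to memory.
            self.memory.append(self.tos)
        self.tos = value
        self.depth += 1

    def pop(self):
        if self.depth == 0:
            raise IndexError("pop from empty stack")
        value = self.tos
        self.depth -= 1
        # Reload the register from memory if anything is left.
        self.tos = self.memory.pop() if self.depth > 0 else None
        return value


# One register-stack pair per stack: here a data stack and a return stack.
data, ret = TosStack(), TosStack()
data.push(2)
data.push(3)
data.push(data.pop() + data.pop())   # FORTH-style "2 3 +"
ret.push(100)                        # the second stack is independent
```

Each extra stack costs one more register, and every push or pop below the top still goes through memory, which is the price mentioned above.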
2- Registers could be thought of as holding the last "n" cells of the single stack they are associated with, so you get a nice stack mirroring. That will definitely work faster, since it's "cheaper" to use the contents of any general-purpose register than the contents of any memory location, even one addressed through specially assigned registers (stack registers). Of course, the registers have to be organised circularly for the best application performance... which will cost the most in compiler performance.
How about the second approach?
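A rough Python simulation of the second approach, just to show the bookkeeping. Everything here is an assumption for illustration: the class name `CachedStack`, the default of 4 registers, and the `mem_accesses` counter, which only exists to show how often memory is touched.

```python
class CachedStack:
    """Single stack whose top n cells live in a circular register file."""

    def __init__(self, n_regs=4):
        self.regs = [0] * n_regs   # simulated general-purpose registers
        self.top = 0               # index of the register holding the top cell
        self.count = 0             # number of registers holding live cells
        self.memory = []           # deeper cells that no longer fit in registers
        self.mem_accesses = 0      # illustrative counter of memory traffic

    def push(self, value):
        n = len(self.regs)
        if self.count == n:
            # All registers are live: spill the oldest cell to memory.
            oldest = (self.top + 1) % n
            self.memory.append(self.regs[oldest])
            self.mem_accesses += 1
            self.count -= 1
        # Advance circularly; when full this reuses the slot just spilled.
        self.top = (self.top + 1) % n
        self.regs[self.top] = value
        self.count += 1

    def pop(self):
        if self.count == 0:
            if not self.memory:
                raise IndexError("pop from empty stack")
            # No live registers: fetch the cell straight from memory.
            self.mem_accesses += 1
            return self.memory.pop()
        value = self.regs[self.top]
        self.top = (self.top - 1) % len(self.regs)
        self.count -= 1
        return value


s = CachedStack(n_regs=4)
for v in range(1, 7):        # push 1..6: only the last two pushes spill
    s.push(v)
popped = [s.pop() for _ in range(6)]
```

As long as the working depth stays within the n registers, pushes and pops never touch memory. The circular indexing is what avoids shuffling values between registers on every push, and that index tracking is exactly the part a compiler would have to resolve statically.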
================================================
((cons(car X)(cdr X))X)
holds(X,P):-P(X);holds(Y,P),IsA(X,Y).
Any (more) questions? SHOOT!