Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add __rvalue(expression) builtin #17050

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

WalterBright
Copy link
Member

This adds the __rvalue(expression) builtin, which causes expression to be treated as an rvalue, even if it is an lvalue.

@WalterBright WalterBright added Enhancement WIP Work In Progress - not ready for review or pulling labels Nov 3, 2024
@dlang-bot
Copy link
Contributor

Thanks for your pull request, @WalterBright!

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + dmd#17050"

@WalterBright WalterBright force-pushed the __rvalue branch 4 times, most recently from 0f383fd to 60c0f9e Compare November 3, 2024 07:24
@thewilsonator thewilsonator added Needs Changelog A changelog entry needs to be added to /changelog Needs Spec PR A PR updating the language specification needs to be submitted to dlang.org labels Nov 3, 2024
@WalterBright WalterBright force-pushed the __rvalue branch 5 times, most recently from 082c48d to 6e0e44f Compare November 4, 2024 08:07
@nordlow
Copy link
Contributor

nordlow commented Nov 4, 2024

Can this __rvalue(...) be used in place of core.lifetime.move(...)?

@nordlow
Copy link
Contributor

nordlow commented Nov 4, 2024

Will the call to use in

S s;
__rvalue(s)
use(s);

be either

  1. a defined behavior where s is set to S.init by __rvalue(s)
  2. a compiler error (optionally in @safe code) at least in the case where S has indirections or destructors, or
  3. an undefined (@system) behavior when S has indirections or destructors

?

I prefer option 2 and is the closest to what Rust does. Can it be implemented with constant-time overhead by adding a status bit to the (parameter) variable declaration indicating that its contents has been invalidated by a move?

I recently saw a https://youtu.be/08gvuBC-MIE?t=1839 in which Jon Kalb admits that the standard committe made a mistake by forcing moved from object to in a so called "fully formed state" instead of a so called "partially formed state". As this prevents certain kinds of optimizations. For details see https://youtu.be/08gvuBC-MIE?t=1839.

@WalterBright
Copy link
Member Author

@nordlow I'll write a proper document for this after I figure out just what the end result will be!

@nordlow
Copy link
Contributor

nordlow commented Nov 5, 2024

Thanks. Please see updates to my comment at #17050 (comment).

@WalterBright
Copy link
Member Author

My opinion on move semantics is that once you've moved s to t, then s's lifetime is over, and it should be in the default initialized state (a concept C++ doesn't have).

@nordlow
Copy link
Contributor

nordlow commented Nov 5, 2024

My opinion on move semantics is that once you've moved s to t, then s's lifetime is over, and it should be in the default initialized state (a concept C++ doesn't have).

Is this realized by

  1. applying, if present, move constructor of S moving s to t and
  2. resetting all the bytes at s to S.init?

@nordlow
Copy link
Contributor

nordlow commented Nov 5, 2024

Have you considered it making it a compiler error to access s after it has been moved? If not, why? I'm asking because this would lead to slight better performance in debug mode at least. And this is one of the reasons why Rust has this behavior.

@WalterBright
Copy link
Member Author

@WalterBright
Copy link
Member Author

Have you considered it making it a compiler error to access s after it has been moved?

Yes, but it requires Data Flow Analysis, which is slow.

@nordlow
Copy link
Contributor

nordlow commented Nov 6, 2024

Have you considered it making it a compiler error to access s after it has been moved?

Yes, but it requires Data Flow Analysis, which is slow.

Ok, thanks.

Afaict, the complexity of supporting Rust-style r-value semantics depends on the context in which __rvalue would be used. For instance, in

S use(S);
S x;
auto y = __rvalue(x)
use(y); // allowed
use(x);  // disallowed

such a analysis could be implemented in the compiler with negligible overhead using an extra status bit in the Declaration node.

But in the general case I realize now that the compiler needs to recurse into all function calls that are passed l-values by reference as arguments.

Do you have a good reference to which data flow analysis in general and its applications such as this one?

@nordlow
Copy link
Contributor

nordlow commented Nov 6, 2024

I currently experimenting with using __rvalue defined in the branch of this MR in my code. Is __rvalue currently supposed to be wrapped in core.lifetime.move? If so, is

static if (__traits(compiles, { int x; const y = __rvalue(x); })) {
	import core.stdc.string : memcpy;
	T move(T)(return scope ref T source) @trusted {
		scope(exit) {
			static immutable init = T.init;
			memcpy(&source, &init, T.sizeof);
		}
		return __rvalue(source);
	}
	void move(T)(ref T source, ref T destination) @trusted {
		scope(exit) {
			static immutable init = T.init;
			memcpy(&source, &init, T.sizeof);
		}
		destination = __rvalue(source);
	}
} else
	public import core.lifetime : move;

/// unary move()
pure nothrow @nogc @safe unittest {
	auto x = S(42);
	assert(x == S(42));
	const y = move(x);
	assert(y == S(42));
	assert(x == S.init);
}

/// binary move()
pure nothrow @nogc @safe unittest {
	auto x = S(42);
	assert(x == S(42));
	S y;
	move(x, y);
	assert(y == S(42));
	assert(x == S.init);
}

version(unittest) {
	struct S { @disable this(this); int x; }
}

a suitable rewrite of the core.lifetime.move overloads?

@TurkeyMan
Copy link
Contributor

__rvalue() is a bad name for a move() intrinsic, the answer you seek is yes, __rvalue is exactly a replacement for move(), and it should be named move(). I'll argue for this before it's merged, but we're making very good progress here! :)

@TurkeyMan
Copy link
Contributor

Also no, this can't be 'wrapped', it's an intrinsic; it needs to be renamed move, you can't wrap it in a function named move.

@TurkeyMan
Copy link
Contributor

The end goal is to completely delete core.lifetime. Don't try to shoehorn this in there; that stuff is all dead.

@nordlow
Copy link
Contributor

nordlow commented Nov 6, 2024

What about the binary overload of move() and moveEmplace?

I'm asking because this MR needs to include the druntime modifications to core.lifetime that makes full use of __rvalue for the sake of deletion of the current very convoluted implementations of move and moveEmplace in core.lifetime.

It's important to note that current behaviour of core.lifetime.move conditionally resets the T source to its T.init value when certain conditions hold for T; specifically

"If T is a struct with a destructor or postblit defined, source is reset to its .init value after it is moved into target, otherwise it is left unchanged. "

I'm not sure this is in line with the behavior of __rvalue that Walter proposes in this MR.

See https://dlang.org/phobos/core_lifetime.html#.move for details.

Luckily druntime is now part of the dmd repo.

@tgehr
Copy link
Contributor

tgehr commented Nov 6, 2024

I agree with @nordlow that __rvalue is more low-level than move. E.g., what happens if you pass a non-POD struct twice to the same function call using __rvalue.

@nordlow
Copy link
Contributor

nordlow commented Nov 6, 2024

I agree with @nordlow that __rvalue is more low-level than move. E.g., what happens if you pass a non-POD struct twice to the same function call using __rvalue.

Nevertheless, I personally believe it is highly preferable to make all the overloads of move and moveEmplace become builtins using their existing name. This is gonna be a breaking change for code that rely on those symbols being templates but projects can be adjusted.

Btw, I'm working on migrating std.traits to become builtin __traits for the sake of reducing template bloat in std.traits. I yet again remind us all of the fact that the C++ standard, per definition, enforces implementations to lower all symbols in std.traits to builtins. For the same reason that I believe that most (or all) druntime's traits and std.traits should be converted to builtin __traits.

@WalterBright
Copy link
Member Author

such a analysis could be implemented in the compiler with negligible overhead using an extra status bit in the Declaration node.

Such certainly looks tempting, but it falls short as soon as the flow control becomes non-trivial. The flow analysis I know comes from class notes in a class I took on optimizations.

@WalterBright
Copy link
Member Author

emplace() will be replaced with the placement new operator, I posted a DIP on it in the development forum.

@WalterBright
Copy link
Member Author

E.g., what happens if you pass a non-POD struct twice to the same function call using __rvalue.

Currently it will get destructed twice.

@WalterBright
Copy link
Member Author

This work is heavily based on Timon Gehr's and Manu Evans' contributions.

@TurkeyMan
Copy link
Contributor

E.g., what happens if you pass a non-POD struct twice to the same function call using __rvalue.

Currently it will get destructed twice.

Another case naturally resolved with caller-destruction!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Needs Changelog A changelog entry needs to be added to /changelog Needs Spec PR A PR updating the language specification needs to be submitted to dlang.org WIP Work In Progress - not ready for review or pulling
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants