In many languages, one of the things I find myself doing is mapping over a list to extract some field. For example, converting a []Person to []Name.

Most languages these days have ways to do this pretty easily:

Kotlin:

people.map { it.Name }

JavaScript:

people.map(p => p.Name)

Rust:

people.iter().map(|p| &p.Name)

Scala:

people.map(_.Name)

With generics, Go finally can do this in a type safe manner:

Map(people, func(t Person) string {
  return t.Name
})

... but we immediately stand out amongst other languages as having ugly, verbose syntax.

Can we do better?

Warning: Bad Ideas ahead! Do not use in the real world.

Baseline

Our initial implementation serves as a starting point:

func Map[T, U any](data []T, f func(T) U) []U {
	res := make([]U, 0, len(data))
	for _, e := range data {
		res = append(res, f(e))
	}
	return res
}
func Example() {
	Map(people, func(t Person) string {
		return t.Name
	})
}

This actually isn't too bad. The implementation is fine, and it's type safe. It's just unwieldy when used over and over. In particular, every call has to spell out the input and output types in the function literal.

Offsetof

My hope is that we can somehow use a syntax like:

Map(people, Person{}.Name)

and have Map work out which field of Person we are accessing and apply that to the entire slice. This would allow fairly simple syntax (still not on par with other languages, though) and retain type safety.

One way to do this is to use unsafe.Offsetof. This looks like a normal function, but it's actually a builtin, and does basically what we want: it gives us the offset of a field within a struct.

unsafe.Offsetof(Person{}.Name) // 8

Since it's a builtin, though, we have to literally call it like this; we cannot hide the call inside Map. At compile time, the expression is replaced with the constant (8, in this case), so deferring the call doesn't work.
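To make the "replaced with a constant" point concrete, here is a minimal sketch. The Person layout (an int64 ID followed by a string Name) is my assumption for illustration; the post never shows the real struct. With that layout, Name sits at offset 8.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Person is a hypothetical layout, assumed for illustration only.
type Person struct {
	ID   int64
	Name string
}

// Offsetof is evaluated by the compiler, so for a non-generic struct the
// result can initialize a constant of type uintptr.
const nameOffset = unsafe.Offsetof(Person{}.Name)

// The argument must be a selector expression such as Person{}.Name.
// Something like
//
//	func offsetOf(f string) uintptr { return unsafe.Offsetof(f) }
//
// does not compile, which is why the call cannot be hidden inside Map.

func main() {
	fmt.Println(nameOffset) // 8 with the layout assumed above
}
```
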

But we can still make it work:

func MapOffset[T, U any](data []T, offset uintptr) []U {
	return Map(data, func(t T) U {
		return *((*U)(unsafe.Add(unsafe.Pointer(&t), offset)))
	})
}
func Example() {
	MapOffset[Person, string](people, unsafe.Offsetof(Person{}.Name))
}

Note: this relies on the first Map implementation for simplicity.

This is not too bad. We do have to explicitly specify both types, though, which is a bit annoying. We also have unsafe screaming at us every time we call it, rather than hiding away in the Map function where we can pretend it doesn't exist.

Abusing pointers

So far we have used Offsetof to get the offset of a field in the struct. But we could also compute the offset from two pointers, one to the start of the struct and one to the field we care about; simple pointer subtraction gives us the offset.

A basic approach would look like:

p := Person{}
Map(people, &p, &p.Name)

But this violates our desire to have a single line, and we need to pass 3 parameters.
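For reference, the three-parameter version could be sketched roughly like this. MapPtr is a name I made up, and the Person layout is assumed. Nothing checks that the field pointer really points inside the base struct, so a mismatched pair gives a garbage offset.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Person is a hypothetical layout, assumed for illustration only.
type Person struct {
	ID   int64
	Name string
}

// MapPtr (hypothetical name) derives the field offset from two pointers:
// base points at some T, field points at a U-typed field inside that same T.
func MapPtr[T, U any](data []T, base *T, field *U) []U {
	offset := uintptr(unsafe.Pointer(field)) - uintptr(unsafe.Pointer(base))
	res := make([]U, 0, len(data))
	for i := range data {
		// Apply the same offset to each element to reach its field.
		res = append(res, *(*U)(unsafe.Add(unsafe.Pointer(&data[i]), offset)))
	}
	return res
}

func main() {
	people := []Person{{1, "Ada"}, {2, "Grace"}}
	p := Person{}
	fmt.Println(MapPtr(people, &p, &p.Name)) // [Ada Grace]
}
```

One nice side effect: both T and U are inferred from the arguments, so the call site no longer has to name any types.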

What we need is a stable pointer to a Person (any Person will do) that is the same at the call site and inside Map.

Another option is to abuse the data already in the list:

Map(people, &people[0].Name)

This mostly works, but breaks when the slice is empty, since there is no people[0] to point into. In another language, we could defer the execution like _ => &people[0].Name and only execute it if the list is non-empty. If we could do that, though, we wouldn't have this problem in the first place.
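Here is a rough sketch of the first-element trick and its failure mode. MapFirst and namesOf are hypothetical names, and the Person layout is assumed. The interesting part is that the empty case fails at the call site, while evaluating &people[0].Name, so no check inside the mapping function can save us.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Person is a hypothetical layout, assumed for illustration only.
type Person struct {
	ID   int64
	Name string
}

// MapFirst (hypothetical name) expects field to point into data[0].
func MapFirst[T, U any](data []T, field *U) []U {
	if len(data) == 0 {
		// Guard for completeness. With &data[0].X at the call site, an
		// empty slice has already panicked before we get here.
		return nil
	}
	offset := uintptr(unsafe.Pointer(field)) - uintptr(unsafe.Pointer(&data[0]))
	res := make([]U, 0, len(data))
	for i := range data {
		res = append(res, *(*U)(unsafe.Add(unsafe.Pointer(&data[i]), offset)))
	}
	return res
}

// namesOf wraps the call so we can observe the panic: ok is false when
// evaluating the arguments blew up.
func namesOf(people []Person) (names []string, ok bool) {
	defer func() {
		if recover() != nil {
			names, ok = nil, false
		}
	}()
	return MapFirst(people, &people[0].Name), true
}

func main() {
	names, ok := namesOf([]Person{{1, "Ada"}, {2, "Grace"}})
	fmt.Println(names, ok) // [Ada Grace] true

	names, ok = namesOf(nil)
	fmt.Println(names, ok) // [] false
}
```
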

Instead, we can maintain a global registry of types, storing Type -> Singleton Instance of Type.

var types = sync.Map{}
func it[T any]() *T {
	t := new(T)
	res, _ := types.LoadOrStore(reflect.TypeOf(t), t)
	return res.(*T)
}
func MapIt[T, U any](data []T, ptr *U) []U {
	base := uintptr(unsafe.Pointer(it[T]()))
	field := uintptr(unsafe.Pointer(ptr))
	// U appears only in MapOffset's result, so it cannot be inferred here.
	return MapOffset[T, U](data, field-base)
}
func Example() {
	MapIt(people, &it[Person]().Name)
}

Going further: AST parsing

I've become fairly convinced we cannot achieve our north star (Map(people, Person{}.Name)) with "normal" Go.

But we could parse the AST ourselves and figure things out. Even better, this would allow nested traversal (Person{}.Name.X), which the other approaches lack.

//go:embed main.go
var fileContents string
var MapAST = createMapper(fileContents)
func createMapper(contents string) func() {
	lines := strings.Split(contents, "\n")
	return func() {
		// Find the line we were called from
		_, _, l, _ := runtime.Caller(2)
		parser.ParseExpr(lines[l-1])
		// ... Do the rest of the parsing...
		// ... Use reflection to traverse the struct based on parsed expression ...
	}
}

I couldn't bring myself to actually fully write this code...
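For the curious, the two elided steps might look roughly like the sketch below. It is not the full hack: it skips the runtime.Caller and go:embed part and simply takes the expression as a string. fieldPath, MapPath, and the Person and Address types are all hypothetical. It also gives up type safety (the result is []any) and only works on exported fields, since reflection cannot hand back unexported ones.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"reflect"
)

// Address and Person are hypothetical types, assumed for illustration only.
type Address struct{ City string }

type Person struct {
	Name    string
	Address Address
}

// fieldPath parses an expression such as "Person{}.Address.City" and
// returns the selector names in access order: ["Address", "City"].
func fieldPath(expr string) ([]string, error) {
	node, err := parser.ParseExpr(expr)
	if err != nil {
		return nil, err
	}
	var path []string
	for {
		sel, ok := node.(*ast.SelectorExpr)
		if !ok {
			break
		}
		// The outermost selector is the last field, so prepend as we unwrap.
		path = append([]string{sel.Sel.Name}, path...)
		node = sel.X
	}
	if _, ok := node.(*ast.CompositeLit); !ok {
		return nil, fmt.Errorf("expected T{}.Field..., got %q", expr)
	}
	return path, nil
}

// MapPath follows path through every element using reflection.
func MapPath[T any](data []T, path []string) ([]any, error) {
	res := make([]any, 0, len(data))
	for _, e := range data {
		v := reflect.ValueOf(e)
		for _, name := range path {
			if v.Kind() != reflect.Struct {
				return nil, fmt.Errorf("cannot select %q on a %s", name, v.Kind())
			}
			v = v.FieldByName(name)
			if !v.IsValid() {
				return nil, fmt.Errorf("no field %q", name)
			}
		}
		res = append(res, v.Interface())
	}
	return res, nil
}

func main() {
	path, err := fieldPath("Person{}.Address.City")
	if err != nil {
		panic(err)
	}
	people := []Person{{"Ada", Address{"London"}}, {"Grace", Address{"New York"}}}
	cities, err := MapPath(people, path)
	fmt.Println(path, cities, err) // [Address City] [London New York] <nil>
}
```
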

Going further: mprotect

The Linux mprotect call can restrict access to portions of memory. lukechampine/freeze uses this to "freeze" a struct, panicking when anything tries to mutate it.

Similar logic could be used to restrict reads as well. The idea would be to restrict each field in the struct one by one, execute the user's struct traversal, and catch the panics to figure out what they are reading.

Aside from being extremely inefficient and a crazy hack, it also isn't practical. We run into the same issue as Offsetof; we cannot defer this logic into the Map function as it is too late at that point. This means users would need to do all this "freeze struct and catch panic" in their own code, defeating the purpose.

Additionally, touching mprotect-protected memory raises a SIGSEGV rather than a simple panic. We could use debug.SetPanicOnFault to change this behavior. Fortunately, this is a per-goroutine setting as well, so we "just" have to do all of this logic in a goroutine.

Going further: doing it right

GitHub issue.