Sector Naming

From DISC Wiki
Revision as of 11:44, 26 December 2016 by Alot (Talk | contribs) (The System Identifiers)

Decoding the names and positions of procedurally generated star systems

Project Details
Status: Mostly Complete
Primary Contributors: Jackie Silver, Alot
Reference Code: Alot's EDTS

Aims

The main aim of this project was to understand how sectors in the E:D galaxy are named, and to be able to predict a system's location (at least approximately) given only its name.

This would allow other tools (such as EDSM) to sanity-check user-provided coordinates, and would also allow historical flight logs to be filled in with approximate locations. It would also be useful in other situations where approximate coordinates for a system are required without access to the galaxy map, such as Fuel Rat dispatching.

Background

Before getting to the meat of the problem, it's important to have a good idea of how the procedurally generated (PG) parts of the E:D galaxy are laid out.

The galaxy is divided into sectors, each of which has a unique name; each sector is a cube, 1280x1280x1280Ly in size. Every procedurally generated system in the galaxy belongs to exactly one PG sector, although due to the presence of hand-crafted content, a system may not bear the name of the PG sector it's actually part of.

For example, both Sol and Col 285 Sector FH-I b11-1 are actually in the Wregoe sector. The latter example is slightly confusing: despite having the same form as a fully procedurally generated system, the sector itself (Col 285) is hand-placed and "overrides" the Wregoe systems that would normally appear there.

These sectors do not align neatly to the coordinate space we know and love; the nearest sector "corner" boundary to Sol is at coordinates [-65, -25, 215]. This oddity is potentially explained by the fact that Frontier appear to use a different coordinate system internally, one which puts Sol at [49985, 40985, 24105]. Adding the boundary coordinates to Sol's internal position gives a multiple of 1280 on every axis: 49985 - 65 = 49920 = 39 x 1280, 40985 - 25 = 40960 = 32 x 1280, and 24105 + 215 = 24320 = 19 x 1280.
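The boundary arithmetic can be checked with a short Python sketch. The function names here are made up for illustration, and the sketch assumes that sector boundaries fall on exact multiples of 1280 in the internal coordinate system, which is what the figures above suggest but is not confirmed.

```python
# Sol's position in the (apparent) internal coordinate system, as quoted above.
SOL_INTERNAL = (49985, 40985, 24105)
SECTOR_SIZE = 1280  # Ly per sector edge


def to_internal(pos):
    """Convert Sol-relative coordinates to the assumed internal coordinates."""
    return tuple(p + s for p, s in zip(pos, SOL_INTERNAL))


def sector_grid_index(pos):
    """Return the (x, y, z) grid index of the sector containing `pos`.

    Assumption: sector boundaries sit on multiples of SECTOR_SIZE
    in internal coordinates.
    """
    return tuple(int(c // SECTOR_SIZE) for c in to_internal(pos))


corner = (-65, -25, 215)
print(to_internal(corner))                                      # (49920, 40960, 24320)
print(all(c % SECTOR_SIZE == 0 for c in to_internal(corner)))   # True
print(sector_grid_index((0, 0, 0)))                             # (39, 32, 18)
```

Under that assumption, Sol sits in the sector with grid index (39, 32, 18), counting from the corner where all coordinates are smallest.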

Sectors are essentially an ordered list, starting from the "bottom-left-near" corner - that is, the corner where all coordinates are smallest. The list starts from this position, and goes left-to-right (X axis), then bottom-to-top (Y axis), then near-to-far (Z axis).

Within a sector, each system is given a unique designation, specified by the sequence of letters and numbers after the sector name.

TODO: More here I haven't thought of?

Decoding a System Name

Let's use an example: Synookio DA-Q b19-7

The sector name is Synookio. This uniquely identifies the sector of space this system is in, within the galaxy as a whole. (More on that later!)
The system identifier is DA-Q b19-7. This specifies approximately where within the Synookio sector this system is.

The System Identifiers

The form of the system identifier is usually the same: [L1][L2]-[L3] [MCode][N1]-[N2]. Example: DA-Q b19-7
The only exception is when N1 is 0 (described later), in which case the form becomes [L1][L2]-[L3] [MCode][N2]. Example: CL-Y d5

L1, L2 and L3 are always single uppercase letters. MCode is always a single lowercase letter between 'a' and 'h', and N1 and N2 are numbers (of arbitrary size). Note that in dense sectors, N2 can get to very high numbers (into the tens of thousands in the galactic core).
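As an illustration of the two forms, here is a minimal Python parsing sketch. The names SystemId, parse_sysid and split_system_name are invented for this example and are not taken from the reference code; the sketch also assumes the identifier is always the last two space-separated tokens of the full system name.

```python
import re
from typing import NamedTuple, Optional


class SystemId(NamedTuple):
    l1: str
    l2: str
    l3: str
    mcode: str
    n1: int
    n2: int


# [L1][L2]-[L3] [MCode][N1]-[N2], where the "N1-" part is omitted when N1 is 0.
_SYSID_RE = re.compile(r"([A-Z])([A-Z])-([A-Z]) ([a-h])(?:(\d+)-)?(\d+)")


def parse_sysid(ident: str) -> Optional[SystemId]:
    """Parse a system identifier such as 'DA-Q b19-7' or 'CL-Y d5'."""
    m = _SYSID_RE.fullmatch(ident)
    if m is None:
        return None
    l1, l2, l3, mcode, n1, n2 = m.groups()
    # A missing N1 means N1 == 0.
    return SystemId(l1, l2, l3, mcode, int(n1) if n1 is not None else 0, int(n2))


def split_system_name(name: str):
    """Split a full system name into (sector name, SystemId or None)."""
    sector, letters, numbers = name.rsplit(" ", 2)
    return sector, parse_sysid(f"{letters} {numbers}")


print(split_system_name("Synookio DA-Q b19-7"))
print(parse_sysid("CL-Y d5"))
```

Splitting from the right means two-word sector names such as Dryau Aowsy are kept intact.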

Boxel sizes by MCode
MCode  Boxel size  Boxels per sector
a      10Ly        2M
b      20Ly        262k
c      40Ly        32k
d      80Ly        4k
e      160Ly       512
f      320Ly       64
g      640Ly       8
h      1280Ly      1

The 1280x1280x1280Ly sector is split into sub-cubes, or boxels (as named by Vitamin Arrr). The MCode determines how large those boxels are; the table above shows the exact figures. For a given MCode, the boxels are essentially an ordered list of cubes, behaving exactly the same as sectors do. However, there is one slightly confusing aspect to this: for any given sector, the grid of boxels is always 128x128x128 in size, regardless of how large those boxels actually are. As a result, some combinations are completely invalid, because the boxel position would fall outside the current sector. This is, fortunately, easily detectable.
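The figures in the table follow a simple doubling pattern, which can be sketched in Python as below. The function names are illustrative only.

```python
SECTOR_SIZE = 1280  # Ly per sector edge


def boxel_size(mcode):
    """Edge length in Ly of a boxel: 10Ly for 'a', doubling with each letter."""
    return 10 * 2 ** (ord(mcode) - ord("a"))


def boxels_per_sector(mcode):
    """Number of boxels of this MCode that actually fit inside one sector."""
    per_side = SECTOR_SIZE // boxel_size(mcode)
    return per_side ** 3


for mcode in "abcdefgh":
    print(mcode, boxel_size(mcode), boxels_per_sector(mcode))
```

For 'a' this gives 128 boxels per side, which matches the fixed 128x128x128 grid; for every other MCode the grid is larger than the sector, which is where the invalid combinations come from.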

For a given MCode 'n', the bottom-left-near boxel is AA-A n0-?, where ? would be the N2 number for each individual star within the boxel (the N2 is not relevant to decoding the position). Note that, as described earlier, this would actually appear in-game as "AA-A n?".

From here, we move left-to-right, incrementing the pattern starting with L1: AA-A n, BA-A n, CA-A n, ...
When reaching the far edge 128 boxels later at XE-A n, we move up a level and start again from the left with YE-A n. Similarly, when reaching the end of the top stack, we move "out" one Z-axis row and start from the bottom-left again.

The pattern increments each letter while it can, in turn: AA-A, BA-A, CA-A, ..., YA-A, ZA-A, AB-A, BB-A, ..., YZ-A, ZZ-A, AA-B, BA-B, ...
When it runs out of letters completely, N1 finally comes into play: YZ-Z n0-?, ZZ-Z n0-?, AA-A n1-?, BA-A n1-?, ...
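The increment pattern amounts to counting in base 26 with L1 as the least significant digit, followed by L2, L3 and finally N1. A minimal Python sketch, using made-up function names and treating a boxel's position in the ordered list as a plain integer index:

```python
def boxel_letters(index):
    """Return (letters, n1) for the index-th boxel in the ordered list."""
    a = ord("A")
    l1 = index % 26
    l2 = (index // 26) % 26
    l3 = (index // 26 ** 2) % 26
    n1 = index // 26 ** 3
    return f"{chr(a + l1)}{chr(a + l2)}-{chr(a + l3)}", n1


def format_sysid(index, mcode, n2):
    """Build the identifier as shown in-game, omitting N1 when it is 0."""
    letters, n1 = boxel_letters(index)
    if n1 == 0:
        return f"{letters} {mcode}{n2}"
    return f"{letters} {mcode}{n1}-{n2}"


print(boxel_letters(127))            # ('XE-A', 0)
print(boxel_letters(128))            # ('YE-A', 0)
print(boxel_letters(26 ** 3))        # ('AA-A', 1)
print(format_sysid(344763, "b", 7))  # DA-Q b19-7
```

Index 127 is the last boxel of the first 128-wide row, which is why XE-A marks the far edge and YE-A starts the next row up.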

Due to the wildly different numbers of boxels for each mass code (MCode), the maximum valid N1 value can be very different: for a mass code of 'a' N1 can go as high as 119, whereas for 'g' and 'h' it can never be anything other than 0.

Reference code for calculating the within-sector offset of a given system identifier can be found in the _get_relpos_from_sysid function in pgnames.py.
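For readers without the reference code to hand, the decoding described in this section can be sketched in Python as follows. This is an independent illustration, not a copy of _get_relpos_from_sysid: the names are invented, and it returns the near corner of the boxel (the corner with the smallest coordinates), whereas the reference code may report a different point within the boxel.

```python
SECTOR_SIZE = 1280  # Ly per sector edge
GRID_WIDTH = 128    # the boxel grid is always 128x128x128


def boxel_index(l1, l2, l3, n1=0):
    """Position of a boxel in the ordered list (L1 is least significant)."""
    a = ord("A")
    return (ord(l1) - a) + 26 * (ord(l2) - a) + 26 ** 2 * (ord(l3) - a) + 26 ** 3 * n1


def boxel_origin(l1, l2, l3, mcode, n1=0):
    """Return the boxel's near corner relative to the sector's near corner.

    Returns None when the boxel would lie outside the sector.
    """
    size = 10 * 2 ** (ord(mcode) - ord("a"))
    index = boxel_index(l1, l2, l3, n1)
    x = index % GRID_WIDTH                   # left-to-right
    y = (index // GRID_WIDTH) % GRID_WIDTH   # bottom-to-top
    z = index // GRID_WIDTH ** 2             # near-to-far
    pos = (x * size, y * size, z * size)
    if any(c >= SECTOR_SIZE for c in pos):
        return None
    return pos


print(boxel_origin("D", "A", "Q", "b", 19))  # (1180, 100, 420)
print(boxel_origin("C", "L", "Y", "d"))      # (0, 80, 80)
print(boxel_origin("X", "E", "A", "b"))      # None: column 127 does not exist for 'b'
```

The final check is the "easily detectable" case mentioned earlier: any coordinate at or beyond 1280Ly means the combination is invalid for that MCode.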

TODO: Diagrams, clearer explanation

The Sector Names

As previously mentioned, each sector has a unique name; there are two forms of these names: those with one word (e.g. Synookio), and those with two words (e.g. Dryau Aowsy). From here on, one-word names will be called class 1 names and two-word names will be class 2.

Both classes of name are built from a common set of "phonemes", or word fragments. These come in several types: prefixes, infixes and suffixes. Infixes are only used by class 1 names; prefixes and suffixes are used by both.

Class 1 names can follow two forms:
Prefix-infix-suffix (e.g. Lysoorb, Vigs, Tzaiwns)
Prefix-infix-infix-suffix (e.g. Synookio, Aucoks, Stuemeae)

Class 2 names always follow the same form:
Prefix-suffix Prefix-suffix (e.g. Dryau Aowsy, Blu Ain, Phaa Aub)
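To make the class 2 form concrete, here is a small Python sketch that splits each word into a prefix and a suffix. The fragment lists are only short excerpts of the full lists in the appendix, and the matching strategy (try the longest prefix first, accept it if the remainder is a known suffix) is an assumption made for this example rather than a description of how the game builds names.

```python
# Excerpts only; the full fragment lists are in the appendix.
PREFIXES = ["Dry", "Ao", "Bl", "Ai", "Ph", "Au", "A", "B", "D", "P"]
SUFFIXES = {"au", "wsy", "u", "n", "aa", "b"}


def split_word(word):
    """Split one word into (prefix, suffix), or return None if no split fits."""
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        rest = word[len(prefix):]
        if word.startswith(prefix) and rest in SUFFIXES:
            return prefix, rest
    return None


def split_class2(name):
    """Split a two-word (class 2) sector name into its four fragments."""
    words = name.split(" ")
    if len(words) != 2:
        return None  # not a class 2 name
    parts = [split_word(w) for w in words]
    return None if None in parts else parts


print(split_class2("Dryau Aowsy"))  # [('Dry', 'au'), ('Ao', 'wsy')]
print(split_class2("Blu Ain"))      # [('Bl', 'u'), ('Ai', 'n')]
print(split_class2("Phaa Aub"))     # [('Ph', 'aa'), ('Au', 'b')]
```

Class 1 names would need the infix lists as well, and are not handled by this sketch.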

The full list of fragments can be found in an appendix, but in short: there are a lot of them, and the order of the lists is very important.

The two naming schemes are both essentially linear sequences, in a very similar manner to the system identifiers. Each scheme starts from the bottom-left of the galaxy and moves right along the X axis, then up the Y axis, then "out" along the Z axis, covering every sector.

What this means is that, importantly, every sector has a name from both the class 1 and class 2 schemes, even though only one of them will eventually appear in the game. For example, Synookio is also Blu Aewsy, and Dryau Aowsy is also Whambua. The choice of which name to use is made by a hash function, which is described later.

The Rest of The Content

Appendices

Full list of phonemes

Prefixes

Th, Eo, Oo, Eu, Tr, Sly, Dry, Ou, Tz, Phl, Ae, Sch, Hyp, Syst, Ai, Kyl,
Phr, Eae, Ph, Fl, Ao, Scr, Shr, Fly, Pl, Fr, Au, Pry, Pr, Hyph, Py, Chr,
Phyl, Tyr, Bl, Cry, Gl, Br, Gr, By, Aae, Myc, Gyr, Ly, Myl, Lych, Myn, Ch,
Myr, Cl, Rh, Wh, Pyr, Cr, Syn, Str, Syr, Cy, Wr, Hy, My, Sty, Sc, Sph,
Spl, A, Sh, B, C, D, Sk, Io, Dr, E, Sl, F, Sm, G, H, I,
Sp, J, Sq, K, L, Pyth, M, St, N, O, Ny, Lyr, P, Sw, Thr, Lys,
Q, R, S, T, Ea, U, V, W, Schr, X, Ee, Y, Z, Ei, Oe

Infixes (sequence 1)

o, ai, a, oi, ea, ie, u, e, ee, oo, ue, i, oa, au, ae, oe

Infixes (sequence 2)

ll, ss, b, c, d, f, dg, g, ng, h, j, k, l, m, n, mb,
p, q, gn, th, r, s, t, ch, tch, v, w, wh, ck, x, y, z,
ph, sh, ct, wr

Suffixes (sequence 1)

oe, io, oea, oi, aa, ua, eia, ae, ooe, oo, a, ue, ai, e, iae, oae,
ou, uae, i, ao, au, o, eae, u, aea, ia, ie, eou, aei, ea, uia, oa,
aae, eau, ee

Suffixes (sequence 2)

Note: class 2 names only use the first 35, up to wyg.

b, scs, wsy, c, d, vsky, f, sms, dst, g, rb, h, nts, ch, rd, rld,
k, lls, ck, rgh, l, rg, m, n, hm, p, hn, rk, q, rl, r, rm,
s, cs, wyg, rn, ct, t, hs, rbs, rp, tts, v, wn, ms, w, rr, mt,
x, rs, cy, y, rt, z, ws, lch, my, ry, nks, nd, sc, ng, sh, nk,
sk, nn, ds, sm, sp, ns, nt, dy, ss, st, rrs, xt, nz, sy, xy, rsch,
rphs, sts, sys, sty, th, tl, tls, rds, nch, rns, ts, wls, rnt, tt, rdy, rst,
pps, tz, tch, sks, ppy, ff, sps, kh, sky, ph, lts, wnst, rth, ths, fs, pp,
ft, ks, pr, ps, pt, fy, rts, ky, rshch, mly, py, bb, nds, wry, zz, nns,
ld, lf, gh, lks, sly, lk, ll, rph, ln, bs, rsts, gs, ls, vvy, lt, rks,
qs, rps, gy, wns, lz, nth, phs