Data Management   ...



The size you use when accessing data is free and fully independent of the declared size:



[Bytes_data: B§ 1 2 2  Data_Len: len]                ; (len = 3)


         

mov eax Bytes_data          ; store data address in eax (the ADDRESS, not the value)

mov eax D$Bytes_data ; store 03020201h in eax despite the bytes Declaration     

mov al  B$Bytes_data        ; store 01 in al
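
The three results above can be checked outside RosAsm. The following is a minimal Python sketch, not RosAsm code: it assumes the little-endian byte order of x86 and that 'len' is stored as a byte (as the result 03020201h implies). The variable names are illustrative only.

```python
# The four bytes declared above: 1, 2, 2, then Data_Len (len = 3).
bytes_data = bytes([1, 2, 2, 3])

# mov eax D$Bytes_data: read four bytes as one little-endian dWord.
eax = int.from_bytes(bytes_data[0:4], "little")

# mov al B$Bytes_data: read only the first byte.
al = bytes_data[0]

print(hex(eax))  # 0x3020201
print(al)        # 1
```

The declared size only decides how the values are stored; the size given at the access decides how many bytes are read.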

 

Here is an example of what this feature makes possible: in Win32 mouse messages, 'lParam' carries the X/Y coordinates packed into a single dWord, but to work with them we need two separate dWords. C people do that by calling a special function to get them. In other assemblers you would usually do:


DD  MouseX: 0   

DD  MouseY: 0


            >> mov eax, Dword Ptr[lParam]

               and eax, 0FFFFh

               mov Dword Ptr[MouseX], eax

            >> mov eax, Dword Ptr[lParam]

               shr eax, 16

               mov Dword Ptr[MouseY], eax



With RosAsm, you can do:


            [MouseX: 0   MouseY: 0]


             > push D§Lparam | pop W§MouseX, W§MouseY   ; addressable as dWords of course:


             > ...

             > mov eax D§MouseX | mov ebx D§MouseY

             > ...



(In fact, all C-like assemblers can do it too. But they are so procedural that most programmers forget how simple assembly is).
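
To see why one push and two word-sized pops are enough, here is a hedged Python model of the RosAsm lines above. It is a sketch under stated assumptions: little-endian storage, MouseX and MouseY declared as dWords initialised to 0, and a made-up lParam value. The function name is hypothetical.

```python
import struct


def push_dword_pop_two_words(lparam):
    """Model 'push D$lParam | pop W$MouseX, W$MouseY' on little-endian x86."""
    stack = struct.pack("<I", lparam)   # pushed dWord: low word is stored first
    data = bytearray(8)                 # [MouseX: 0   MouseY: 0], two zeroed dWords
    data[0:2] = stack[0:2]              # first pop: low word into MouseX
    data[4:6] = stack[2:4]              # second pop: high word into MouseY
    return struct.unpack("<II", data)   # read both back as dWords


# Hypothetical message value: X = 300 in the low word, Y = 200 in the high word.
lparam = (200 << 16) | 300
mouse_x, mouse_y = push_dword_pop_two_words(lparam)
print(mouse_x, mouse_y)  # 300 200
```

The upper halves of MouseX and MouseY stay zero only because they were declared as 0 and nothing else writes to them, which is what makes the later dWord reads safe.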



Text declarations do not absolutely require B§ (or B$) specifier (you can use it, of course):


          [Val1: 264000

             Some_French_Text: "C'est si bon... " 0

               Val2: 0FFFFFFFF]

          

Good: Val1 and Val2 are dWords, text is bytes (note: '0' is a dWord too...)




By default, data sets begin on a dWord boundary unless otherwise specified. This alignment is applied in the data section each time you open a new data bracket. A clean data declaration should always begin with the larger values and end with the smaller ones. Dirty:


[Val1: B$ 1     Val2: W$ 2     Val3: D$ 3]


References to 'Val2' or 'Val3' in code can require two memory accesses instead of one at run time, because neither value lies on its own boundary.
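
The cost of the 'dirty' ordering comes from the offsets the values end up at. The following Python sketch (an illustration, not part of RosAsm) computes them for values stored back to back in one bracket set, assuming the set itself starts on an aligned address. The helper name and size table are hypothetical.

```python
SIZES = {"B": 1, "W": 2, "D": 4}   # bytes per size specifier


def layout(fields):
    """Return (name, offset, is_aligned) for fields stored back to back."""
    result, position = [], 0
    for name, size_code in fields:
        size = SIZES[size_code]
        result.append((name, position, position % size == 0))
        position += size
    return result


dirty = layout([("Val1", "B"), ("Val2", "W"), ("Val3", "D")])
clean = layout([("Val3", "D"), ("Val2", "W"), ("Val1", "B")])

print(dirty)  # [('Val1', 0, True), ('Val2', 1, False), ('Val3', 3, False)]
print(clean)  # [('Val3', 0, True), ('Val2', 4, True), ('Val1', 6, True)]
```

Declaring the larger values first keeps every value on a multiple of its own size without any padding.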


Another consequence of data alignment is that you should avoid reading differently aligned data sets in one operation. Example:

          

[Val1: B$ 1]  [Val2: W$ 2]  [Val3: D$ 3]    ; are now well aligned but:


mov esi, Val1 | lodsb                        ; do this or that with al = 1

                lodsw                        ; do what you can with ax = 0
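
The zero in ax comes from the padding inserted between the sets. Here is a small Python model of that situation. It is a sketch that assumes each bracket set starts on a dWord boundary (the default described above) and that the padding bytes are zero. The function names are hypothetical.

```python
def align_up(offset, boundary):
    """Round offset up to the next multiple of boundary."""
    return (offset + boundary - 1) // boundary * boundary


def build_section(sets, boundary=4):
    """Concatenate data sets, zero-padding so each one starts on the boundary."""
    memory, starts = bytearray(), []
    for raw in sets:
        padding = align_up(len(memory), boundary) - len(memory)
        memory.extend(bytes(padding))    # assumed zero-filled padding
        starts.append(len(memory))
        memory.extend(raw)
    return memory, starts


memory, starts = build_section([
    (1).to_bytes(1, "little"),   # [Val1: B$ 1]
    (2).to_bytes(2, "little"),   # [Val2: W$ 2]
    (3).to_bytes(4, "little"),   # [Val3: D$ 3]
])

esi = starts[0]                                      # mov esi, Val1
al = memory[esi]                                     # lodsb
esi += 1
ax = int.from_bytes(memory[esi:esi + 2], "little")   # lodsw reads padding, not Val2
esi += 2

print(starts)   # [0, 4, 8]
print(al, ax)   # 1 0
```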


When you do not want alignment between some data, simply group them in one single Bracket set. If, for some good or bad reason, you want this feature off, you can force unaligned sets this way:


[OneSet: B§ 12 24 32]

[<AnotherSet: 1 2]


'[<' prevents 'AnotherSet' from being aligned, so that:


mov eax D§OneSet+3    ; >>>  eax = 1
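
As a cross-check of the result above, this Python sketch compares the same read with and without the '<' sign. It assumes little-endian storage and, for the aligned case, a single zero padding byte after the three bytes of OneSet. The names are illustrative.

```python
import struct

one_set = bytes([12, 24, 32])            # [OneSet: B$ 12 24 32]
another_set = struct.pack("<II", 1, 2)   # AnotherSet: two dWords, 1 and 2

unaligned = one_set + another_set             # [<AnotherSet: 1 2], no padding
aligned = one_set + bytes(1) + another_set    # [AnotherSet: 1 2], one assumed zero pad byte


def read_dword(memory, offset):
    """Read a little-endian dWord, like 'mov eax D$OneSet+offset'."""
    return int.from_bytes(memory[offset:offset + 4], "little")


print(read_dword(unaligned, 3))  # 1
print(read_dword(aligned, 3))    # 256
```

Without '<', the same read straddles the padding byte and the first three bytes of AnotherSet, so it no longer returns 1.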



This '<' alignment sign can also be used for custom alignments. This is particularly useful for alignments bigger than a dWord (as required, for example, by newer instructions operating on qWords):


[<16  OneQwordSet:  Q§ 12  24  ]


The boundary number may be given in decimal, hexadecimal or binary.


When you do not define any Alignment, RosAsm can Align Your Data on their own Boundaries, if the first Data in the Data Set is given with a Size Specifier. Example:


[ByteOne: B$ 1]

[ByteTwo: B$ 2]

[ByteThree: B$ 3]

[ByteFour: B$ 4]


In Such case, the Data Parser aligns the Data on their own Boundaries (instead of Four Bytes):


One Byte for Bytes, two Bytes for Words, four Bytes for dWords, for Unicode and for Float, eight Bytes for qWords and Double, and 16 Bytes for Extended Math.
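
To illustrate the difference this makes for the four one-byte sets above, here is a Python sketch that computes where each set would start under a one-byte boundary and under the default four-byte boundary. It only models the rule as described here; the function name is hypothetical and the sets are assumed to follow each other directly in the data section.

```python
def set_offsets(set_sizes, boundary):
    """Start offset of each data set when every set is aligned on boundary."""
    offsets, position = [], 0
    for size in set_sizes:
        position = (position + boundary - 1) // boundary * boundary  # round up
        offsets.append(position)
        position += size
    return offsets


four_byte_sets = [1, 1, 1, 1]   # ByteOne, ByteTwo, ByteThree, ByteFour

print(set_offsets(four_byte_sets, boundary=1))  # [0, 1, 2, 3]
print(set_offsets(four_byte_sets, boundary=4))  # [0, 4, 8, 12]
```

With own-boundary alignment the four bytes stay contiguous, so they can still be read together as one dWord.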



When writing common applications, as long as you do not need much size optimization (who cares whether 100 bytes more or less are stored in Data nowadays?), you may do without any Size Specifier in most data declarations, provided you do not want to access several values in one operation (like REP LODSB).


Data alignment is a point that takes RosAsm a bit away from 'low level assembler': it is done 'behind your back'. Well, I chose to do it because inside a PE file, in any case, the data area cannot be controlled by programmers: it is a section separate from the code section, and writing data inside a code section (it is possible) just slows down accesses... In most Win32 structure handling, dWord alignment is absolutely required.



So, remember, each time you open a new data bracket, you tell the assembler all of the following things:


'[' >>>


          - Open a data set (if Data:)

          - Align data on dWord boundary (not if '[<' is used; on the data's own boundary if the first data has a size specifier)

          - Set Len to zero

          - Set Type value to dWord


Data Alignment can be defined after the '[<' Chars:


[<0 MyData: 025]             ; same as [<MyData: 025]

[<4 MyData: 025]             ; same as [MyData: 025]

[<010 MyData: 025]           ; aligned on a 16-byte boundary (010 is hexadecimal)

[<0010_0000 MyData: 025]     ; aligned on a 32-byte boundary (0010_0000 is binary)



Data are stored in the order you write them. This implies that you can, for example, clear several consecutive data sets at once:


[FirstSet: B$ 0 #100]


[SecondSet: B$ 0 #100]


... mov edi FirstSet, al 0, ecx 200 | rep stosb 


Everything is OK, even if some code stands between the two declarations. But take care that:


[FirstSet: B$ 0 #100] ; '0' <<<<<<<<<<<<<<<< Data!!!


[SecondSet: B§ ? #100]  ; '?' <<<<<<<<<<<<<<<< Virtual Data!!!


... mov edi FirstSet, al 0, ecx 200 | rep stosb 


would not work as intended: SecondSet, being Virtual Data, is not located after FirstSet at all (Data and Virtual Data are two separate sections in a PE).
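
The two cases can be contrasted with a small Python model. This is only a sketch: each section is represented by its own byte array, the sets are pre-filled with a made-up non-zero marker so the effect is visible, and running past the end of a section is reported as an error (on a real machine the fill would overwrite whatever follows FirstSet, or fault). All names are hypothetical.

```python
def rep_stosb(section, start, value, count):
    """Model of 'rep stosb': write count copies of value starting at start."""
    if start + count > len(section):
        raise IndexError("fill runs past the end of this section")
    section[start:start + count] = bytes([value]) * count


MARK = 0xAA  # hypothetical non-zero content, so clearing is observable

# Case 1: FirstSet and SecondSet are both Data, stored one after the other.
data = bytearray([MARK] * 200)
rep_stosb(data, 0, 0, 200)
both_cleared = not any(data)

# Case 2: FirstSet is Data, SecondSet is Virtual Data in a separate section.
data = bytearray([MARK] * 100)
virtual_data = bytearray([MARK] * 100)
try:
    rep_stosb(data, 0, 0, 200)
    overran = False
except IndexError:
    overran = True
second_set_untouched = all(b == MARK for b in virtual_data)

print(both_cleared, overran, second_set_untouched)  # True True True
```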


~~~~~~~