One of my early days’ Assembly (ASM-51) programs!!!


;-------=========-------=========-------=========-------=========-------=========-------=========-------=========-------
;       Data definitions
;-------=========-------=========-------=========-------=========-------=========-------=========-------=========-------
    ;------======------======------======------======------======------======------======------======------
    ;       Variable definitions
    ;------======------======------======------======------======------======------======------======------
        IR_BUF          EQU     0x80        ; Buffer for sample store (indirect access only)
        IR_DR_L         EQU     0x7E        ; Buffer Low for storing the smallest bit duration
        IR_DR_H         EQU     0x7F        ; Buffer High for storing the smallest bit duration
        IR_GEN_BUF      EQU     0x7D        ; General purpose buffer, _NOT_STATIC_
        IR_ERR          EQU     0x7C        ; IR input error counter

    ;------======------======------======------======------======------======------======------======------
    ;       Value definitions
    ;------======------======------======------======------======------======------======------======------
        IR_CTR          EQU     0x40        ; Count for sample store
        CRLF_CTR        EQU     0x10        ; Count for sending a CRLF
        TMR_SMPL_1      EQU     0xA2        ; Timer reload value (approx. 100uS)
        TMR_SMPL        EQU     0x47        ; Timer reload value (approx. 200uS)

    ;------======------======------======------======------======------======------======------======------
    ;       Pin & Register definitions
    ;------======------======------======------======------======------======------======------======------
        IR_IN           EQU     P3.2        ; IR signal input pin

;-------=========-------=========-------=========-------=========-------=========-------=========-------=========-------



;-------=========-------=========-------=========-------=========-------=========-------=========-------=========-------
;       Code starts here
;-------=========-------=========-------=========-------=========-------=========-------=========-------=========-------
        org     0000h
        jmp     reset_rtn

        org     0003h
        jmp     ie0_rtn
;-------=========-------=========-------=========-------=========-------=========-------=========-------=========-------

    ;------======------======------======------======------======------======------======------======------
    ;       Macro definitions particular to this file
    ;------======------======------======------======------======------======------======------======------
    MACRO   @RST_TMR0_00
        clr     TR0                     ; Stop the Timer 0
        mov     TH0,#0x00               ; Reset high byte of Timer 0
        mov     TL0,#0x04               ; Compensate the cycles
        setb    TR0                     ; Kick timer 0 off again
    ENDMAC

    MACRO   @RESET_TR0_TO
        clr     TR0                     ; Stop the Timer 0
        mov     TH0,|1                  ; Reset high byte of Timer 0
        mov     A,#0x06                 ; Compensate the cycles
        add     A,|2                    ; Add it to the TL0 value
        mov     TL0,A                   ; Move it to TL0 and
        setb    TR0                     ; Kick timer 0 off again
    ENDMAC

    MACRO   @SEND_CHAR
        jnb     TI,$                    ; Wait until the last transmission is over
        clr     TI                      ; For new transmission
        mov     SBUF,|1
    ENDMAC

;===============================================================================================================
;       External Interrupt handling routine, IR signal is fed here
;===============================================================================================================
ie0_rtn:
        push    ACC
        clr     EX0                     ; Disable EX0 first
    ;------======------======------======------======------======------======------======------======------
    ;       Enter external interrupt 0 service routine
    ;------======------======------======------======------======------======------======------======------
        clr     TR0                     ; Stop the Timer 0
        mov     TH0,#0x00               ; Reset high byte of Timer 0
        mov     TL0,#0x03               ; Reset low byte of Timer 0
        setb    TR0                     ; Kick timer 0 off

        mov     R0,#IR_BUF              ; Pointer for sample store buffer
        mov     R1,#IR_BUF              ; Pointer for sample send buffer
        mov     R3,#IR_CTR              ; Counter for sample store buffer
        mov     R4,#0x02                ; Counter for sample send buffer
        mov     R5,#CRLF_CTR            ; Counter for sending a '\n' during dump

    ;------======------======------======------======------======------======------======------======------
    ;       Sample and store the incoming start bit
    ;------======------======------======------======------======------======------======------======------
        jnb     IR_IN,$                 ; Start sampling - Bit 0
        clr     TR0                     ; Stop timer 0 to note values
        mov     @R0,TH0                 ; Store high byte first
        inc     R0                      ; Point to next buffer slot
        mov     @R0,TL0                 ; Store low byte next
        inc     R0                      ; Point to next buffer slot
        clr     TF0                     ; Clear to make sure only timeout triggers error
        dec     R3                      ; For writes above
        dec     R3                      ; For writes above
        @RST_TMR0_00                    ; Reset the timer to 0x0000 before continuing
smpl_sync:
        jb      TF0,smpl_exit           ; Error, jump to handle the error
        jnb     IR_IN,smpl_sync         ; If the line is low for a while, timeout occurs
        clr     A                       ; To measure the in bit

    ;------======------======------======------======------======------======------======------======------
    ;       Sampling loop, iterates once for every bit
    ;------======------======------======------======------======------======------======------======------
smpl_lp:
        inc     A                       ; Continue measuring the duration
        jz      smpl_exit               ; INC leaves CY untouched, so test for wrap to 0x00 instead
        nop                             ;
        nop                             ;
        nop                             ;
        nop                             ;
        nop                             ; For reducing the count
        jb      IR_IN,smpl_lp           ; Continue measuring the duration
        mov     @R0,A                   ; Store the count
        inc     R0                      ; Point to the next byte in buffer
        inc     R4                      ; Increment bytes to send
smpl_chk_tmout:
        inc     A
        jz      smpl_err                ; If the count wraps to 0x00, Error
        nop                             ;
        nop                             ;
        nop                             ;
        nop                             ;
        nop                             ; For reducing the count
        jnb     IR_IN,smpl_chk_tmout    ; Continue measuring the duration
        clr     A                       ; Make sure to measure from 0x00
        djnz    R3,smpl_lp              ; Check for buffer over flow

    ;------======------======------======------======------======------======------======------======------
    ;       Exit sampling loop, start sending the byte values over UART @ (1,19200,N,1,NoFlow)
    ;------======------======------======------======------======------======------======------======------
smpl_exit:
        clr     TF0                     ; Ensure overflow occurs only after timeout
        @RESET_TR0_TO   #0x20,#0x00     ; For a little sleep
        jnb     TF0,$                   ; Delay for the duration (not calculated)
        clr     TR0                     ; Stop the timer
        clr     TF0                     ; Clear the timer overflow flag
        clr     IE0                     ; Just in case another interrupt is pending on IE0
    ;   setb    EX0                     ; Re enable the external 0 interrupt (May this idea work!!!)
        pop     ACC                     ;
        reti

smpl_err:
        @SEND_CHAR  #'R'                ; Indicate error to monitor
        inc     IR_ERR                  ; Increment the IR Error counter
        pop     ACC                     ;
        reti

;------======------======------======------======------======------======------======------======------======------
;                       Exit external interrupt 0 ISR
;------======------======------======------======------======------======------======------======------======------


;------======------======------======------======------======------======------======------======------======------
;                       Enter the reset routine and main loop
;------======------======------======------======------======------======------======------======------======------
reset_rtn:
    ;------======------======------======------======------======------======------======------======------
    ;       Peripheral initialization routines
    ;------======------======------======------======------======------======------======------======------
        mov     SP,#0xE0                ; Stack grows up from 0xE1 (31 bytes, upper RAM of a 256-byte part)
        orl     PCON,#0x80              ; Set SMOD of PCON, Enable double baud rate
        anl     TMOD,#0xAF              ; Clear C/T, M0 for Timer1 of TMOD
        mov     TMOD,#0x21              ; Set M1 for Timer 1 and M0 for Timer 0 of TMOD
                                        ; Set Timer 1 to Mode 2 (8-bit auto reload) for Baud Rate Generation
                                        ; Timer 0 is in Mode 1 (16 bit timer mode)
        mov     TH1,#0xFD               ; 19200 bps with SMOD set (assuming an 11.0592 MHz crystal)
        clr     SM0                     ; Clear SM0 of SCON
        setb    SM1                     ; Set SM1 of SCON
                                        ; Set UART to Mode 1 (8-bit UART)
        setb    REN                     ; Set REN of SCON to Enable UART Receive
        setb    TR1                     ; Set TR1 of TCON to Start Timer1
        setb    TI                      ; Set TI of SCON to Get Ready to Send
        clr     RI                      ; Clear RI of SCON to Get Ready to Receive

        setb    IT0                     ; Make interrupt edge triggered
;//     setb    EX0                     ; Enable external interrupt0
        setb    EA                      ; Enable all interrupts
        mov     R7,#0x14                ; Implement a 1 sec delay during start up
sec_delay:
        clr     TR0                     ; Stop timer 0
        mov     TH0,#0x3C               ; Set timer 0 for approx. 50mS delay (not accurate)
        mov     TL0,#0xB3               ; Compensate for load cycles (4)
        setb    TR0                     ; Kick timer off for running
        jnb     TF0,$                   ; Wait for timer overflow
        clr     TF0                     ; Clear the timer overflow flag
        djnz    R7,sec_delay            ; If not 20 times, continue the delay loop

        setb    EX0                     ; Ensure EX0 enabled before entering main loop
    ;------======------======------======------======------======------======------======------======------
    ;       Main loop starts here, infinite loop
    ;------======------======------======------======------======------======------======------======------
main_lp:
        jnb     RI,ir_rx_chk            ; For PC control, through serial port
        clr     RI                      ; If received a control byte, echo it first
        jnb     TI,$                    ; Is a transmit in progress?
        clr     TI                      ; To indicate next transmission
        mov     A,SBUF                  ; Load the serial in byte in here
        mov     SBUF,A                  ; Echo back
        cjne    A,#'m',ir_rx_chk        ; If asking for a memory dump
        call    mem_dump                ; Dump the memory to the serial terminal
ir_rx_chk:
        jb      EX0,main_lp             ; Wait for the IR signal to be sampled (the ISR clears EX0)


    ;------======------======------======------======------======------======------======------======------
    ;   //\\//\\ Debug Section Send bytes: for debug printing only 17th to 20th byte //\\//\\
    ;------======------======------======------======------======------======------======------======------
;       mov     A,R1                    ; Load the current send buffer pointer
;       add     A,#0x26                 ; Offset by 38 to start with 38th byte
;       mov     R1,A                    ; Set the buffer pointer to 17th byte
;       mov     R4,#0x04                ; Set counter for sending only 4 bytes
    ;------======------======------======------======------======------======------======------======------
    ;   //\\//\\ End Section Send bytes: for debug printing only 17th to 20th byte //\\//\\
    ;------======------======------======------======------======------======------======------======------

    ;------======------======------======------======------======------======------======------======------
    ;       Send a CR/LF feed before start
    ;------======------======------======------======------======------======------======------======------
        jnb     TI,$                    ; Wait until the last transmission is over
        clr     TI                      ; Clear the transmit flag
        mov     SBUF,#0x0D              ; Send a carriage return
        jnb     TI,$                    ; Wait until the last transmission is over
        clr     TI                      ; Clear the transmit flag
        mov     SBUF,#0x0A              ; Send a line feed

send_lp:
        mov     A,@R1                   ; Load byte from buffer
        call    hex_to_uart             ; Send the hex value of the byte on serial port
        mov     @R1,#0xFF               ; Reset the location
        inc     R1                      ; Increment the send pointer
        djnz    R4,send_lp              ; Decrement the send counter (<<)
 ;\\        djnz        R5,send_continue        ; Decrement the '\n' counter (>>)
    ;------======------======------======------======------======------======------======------======------
    ;       Send a line feed after All digits (previously (12x3) digits (>>))
    ;------======------======------======------======------======------======------======------======------
        jnb     TI,$                    ; Wait until the last transmission is over
        clr     TI                      ; Clear the transmit flag
        mov     SBUF,#0x0D              ; Send a carriage return
        jnb     TI,$                    ; Wait until the last transmission is over
        clr     TI                      ; Clear the transmit flag
        mov     SBUF,#0x0A              ; Send a line feed
        jnb     TI,$                    ; Wait until the last transmission is over
        clr     TI                      ; Clear the transmit flag
        mov     SBUF,#0x0A              ; Send another line feed

;\\         mov     R5,#CRLF_CTR        ; Reset line feed counter (>>)
;\\send_continue:                       ; (>>)
;\\     djnz        R4,send_lp          ; Decrement the send counter(>>)

    ;------======------======------======------======------======------======------======------======------
    ;       Sent all bytes, continue from beginning
    ;------======------======------======------======------======------======------======------======------
        setb    EX0                     ; Start the next receive cycle
        jmp     main_lp                 ; Continue forever

;------======------======------======------======------======------======------======------======------======------
;                   (Never)Exit the reset routine and main loop
;------======------======------======------======------======------======------======------======------======------

;------======------======------======------======------======------======------======------======------======------
;                   Put hex value on serial port
;------======------======------======------======------======------======------======------======------======------
hex_to_uart:
    ;------======------======------======------======------======------======------======------======------
    ;       Send High nibble in ASCII first
    ;------======------======------======------======------======------======------======------======------
        mov     IR_GEN_BUF,A            ; Store the byte temporarily in buffer
        swap    A                       ; Bring the high nibble down
        anl     A,#0x0F                 ; Mask off the other nibble
        cjne    A,#0x0A,high_is_char    ; CY is set if the hex digit is less than 0xA
  high_is_char:
        jc      high_dont_add           ; skip adding 7 to make it ASCII alphabet
        add     A,#0x07                 ; Make it ASCII alphabet
  high_dont_add:
        add     A,#0x30                 ; Convert to ASCII
        jnb     TI,$                    ; Wait till the last transmission is over
        clr     TI                      ; Clear the transmission flag
        mov     SBUF,A                  ; Send it

    ;------======------======------======------======------======------======------======------======------
    ;       Send Low nibble in ASCII next
    ;------======------======------======------======------======------======------======------======------
        mov     A,IR_GEN_BUF            ; Restore the byte from buffer
        anl     A,#0x0f                 ; Mask off the high nibble
        cjne    A,#0x0A,low_is_char     ; CY is set if the hex digit is less than 0xA
  low_is_char:
        jc      low_dont_add            ; skip adding 7 to make it ASCII alphabet
        add     A,#0x07                 ; Make it ASCII alphabet
  low_dont_add:
        add     A,#0x30                 ; Convert to ASCII
        jnb     TI,$                    ; Wait till the last transmission is over
        clr     TI                      ; Clear the transmission flag
        mov     SBUF,A                  ; Send it

    ;------======------======------======------======------======------======------======------======------
    ;       Send a space after each byte
    ;------======------======------======------======------======------======------======------======------
        jnb     TI,$                    ; Wait until the last transmission is over
        clr     TI                      ; Clear the transmit flag
        mov     SBUF,#0x20              ; Send a white space character
        ret

;------======------======------======------======------======------======------======------======------======------
;               Dump memory to UART in hex
;------======------======------======------======------======------======------======------======------======------
mem_dump:
    ;------======------======------======------======------======------======------======------======------
    ;       Loop till all bytes sent
    ;------======------======------======------======------======------======------======------======------
        clr     EA                      ; Disable all interrupts
        mov     A,R1                    ; Save the send pointer on the stack, because
        push    ACC                     ; direct address 0x80 (IR_BUF) is the P0 SFR, not RAM
        mov     R1,#0xFF                ; Start with location 0xFF
mem_dump_loop:
        mov     A,@R1                   ; Load the contents into A
        call    hex_to_uart             ; And send the hex value
        djnz    R1,mem_dump_loop        ; Continue till all bytes sent
        mov     A,@R1                   ; For location 0x00
        call    hex_to_uart             ; Send it
        pop     ACC                     ; Restore the send pointer
        mov     R1,A                    ;
        setb    EA                      ; Enable all interrupts
        ret
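
The reload constants in the listing are easier to sanity-check off the target. Here is a rough Python sketch that recomputes the UART baud rate and the timer intervals from the standard 8051 formulas, and mirrors the nibble-to-ASCII logic of hex_to_uart. The 11.0592 MHz crystal is an assumption (the listing never states the clock), and the function names are made up for illustration.

```python
# Assumed crystal frequency; the listing does not state it.
FOSC_HZ = 11_059_200
MACHINE_CYCLE_S = 12 / FOSC_HZ  # classic 8051: 12 clocks per machine cycle


def mode2_baud(th1: int, smod: int) -> float:
    """Baud rate of UART mode 1 with Timer 1 in 8-bit auto-reload mode."""
    return (2 ** smod / 32) * FOSC_HZ / (12 * (256 - th1))


def timer_interval_us(reload: int, width_bits: int) -> float:
    """Microseconds until overflow for a timer loaded with `reload`."""
    counts = (1 << width_bits) - reload
    return counts * MACHINE_CYCLE_S * 1e6


def byte_to_hex_ascii(value: int) -> str:
    """Mirror of hex_to_uart: two hex digits followed by a space."""
    out = ""
    for nibble in (value >> 4, value & 0x0F):
        code = nibble + 0x30      # '0'..'9'
        if nibble >= 0x0A:
            code += 0x07          # skip ':'..'@' to land on 'A'..'F'
        out += chr(code)
    return out + " "


if __name__ == "__main__":
    print("baud:", round(mode2_baud(0xFD, smod=1)))
    print("TMR_SMPL_1 (us):", round(timer_interval_us(0xA2, 8)))
    print("startup tick (ms):", round(timer_interval_us(0x3CB3, 16) / 1000))
    print("0x3C ->", byte_to_hex_ascii(0x3C))
```

With a different crystal only FOSC_HZ needs to change; the reload values themselves are taken straight from the listing.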

A Battery Health Check (HW)

The goal is to develop a battery health-check system in 40 hours or less. This includes learning a couple of new skills, defining the target design, designing the hardware and software, and testing it.

So far, I have learned KiCAD, prototyped the analog section on a breadboard, come up with this design and drawn this schematic in KiCAD. It has taken about 16 to 18 hours of work until now. BTW, I had to create the schematic library package for the PIC16F883 microcontroller myself.

I used to draw schematics and create PCBs in CirCAD. Since it is a commercial package and really costly ($$$$), I needed an alternative for the moment. I explored a few options and found that KiCAD does the job, at least for now.

The Goal:

I wanted to come up with a battery monitoring system that collects a battery’s charge/discharge data for measuring and plotting its performance. The system should charge the battery at a defined current, normally fixed by the charger. It then runs a discharge cycle with a predetermined load (e.g. 0.1C), collecting performance characteristics such as rate of voltage drop, actual current delivered, discharge cycle time and total cycle time. The cycle is then repeated with different loads (e.g. 0.2C, 0.3C, …, 2C). The number of steps (load cycles and load values) should be selectable in software. The results should be logged to permanent storage for later import. The imported data can be used to measure and plot the battery characteristics, giving the performance and life expectancy under different load conditions.
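
To make the C-rate arithmetic concrete, here is a small Python sketch of the import side. It assumes a hypothetical 7 Ah battery and a log made of (time in seconds, current in amperes) samples; neither the capacity nor the log format comes from the actual design, and the function names are invented for illustration.

```python
from __future__ import annotations


def load_currents(capacity_ah: float, c_rates: list[float]) -> list[float]:
    """Convert C-rates into discharge currents in amperes."""
    return [round(capacity_ah * rate, 3) for rate in c_rates]


def delivered_ah(samples: list[tuple[float, float]]) -> float:
    """Integrate (time_s, current_a) samples with the trapezoid rule."""
    total_ampere_seconds = 0.0
    for (t0, i0), (t1, i1) in zip(samples, samples[1:]):
        total_ampere_seconds += (t1 - t0) * (i0 + i1) / 2.0
    return total_ampere_seconds / 3600.0


if __name__ == "__main__":
    # Hypothetical 7 Ah battery stepped through a few C-rates.
    print(load_currents(7.0, [0.1, 0.2, 0.5, 1.0, 2.0]))
    # A made-up one-hour log at a constant 2 A.
    print(delivered_ah([(0, 2.0), (1800, 2.0), (3600, 2.0)]))
```

Comparing the delivered Ah at each C-rate against the rated capacity is one simple way to express the "health" the project is after.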

About the Circuit:

It consists of four modules: the analog frontend, the charge/discharge controller, the microcontroller and the display. The long term storage (using EN25F80) is still pending.

The analog frontend takes three lines from the battery: the battery’s +ve and -ve terminals directly, and a third lead that connects the battery to the load or charger. The charge/discharge controller is built around a ULN2003 IC with a total of five relays, one for the charger and four for the load. Each load relay switches a specific load attached to it; I chose 16 A, 8 A, 4 A and 2 A. This allows me to discharge the battery under test (BUT or DUT) at any rate from 2 A to 30 A in 2 A steps.
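
Because the four loads are binary weighted, picking the relays for a target current is just a greedy walk from the largest load down. A minimal Python sketch of that selection follows; the function name and the tuple ordering are assumptions for illustration, not the firmware's actual interface.

```python
from __future__ import annotations

# Load attached to each relay, in amperes (values from the post).
RELAY_LOADS_A = (16, 8, 4, 2)


def relays_for(target_a: int) -> tuple[bool, ...]:
    """Return on/off states for the 16 A, 8 A, 4 A and 2 A relays."""
    if target_a % 2 != 0 or not 2 <= target_a <= sum(RELAY_LOADS_A):
        raise ValueError("target must be an even current from 2 A to 30 A")
    remaining = target_a
    states = []
    for load in RELAY_LOADS_A:
        switch_on = remaining >= load
        if switch_on:
            remaining -= load
        states.append(switch_on)
    return tuple(states)


if __name__ == "__main__":
    for amps in (2, 14, 22, 30):
        print(amps, relays_for(amps))
```

The same result falls out of treating target/2 as a 4-bit number, which is probably how a microcontroller would drive the ULN2003 inputs from a single port nibble.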

More to come soon…

Note: All the work shared on this page is free to use under the following terms: all shares/distributions, including derived works, should include a link to this page, and you use this information at your own risk. I am not responsible for any results, including success and/or damages.

Why should Hard Disks have only one set of R/W Heads – Arm?

I just started wondering why hard disks should have only one arm for the R/W heads. I have not done much research yet, but I think the technology today is mature enough to have more than one arm, thus improving the disk’s response time.

One obvious challenge is that the chance of a head crash doubles if we have two arms. The life expectancy is probably cut in half as well. But can today’s technology take care of these challenges? Is it worth it? I am still digging.
More to come…

DAS Stack: Let’s continue the journey – About the “Disk”

From the perspective of typical internal construction, disks can be broadly classified into three types: the physical hard disk, the flash-based Solid State Disk (SSD) and the virtual disk presented by storage arrays (network disks).

The physical hard disks are usually made up of a hard metal disk (usually aluminum) coated with a magnetic material. The magnetic material records the data and the aluminum disk provides the rigid support.

The diagram below shows a quick view of the insides of a typical hard disk drive.

More detail at http://news.bbc.co.uk/2/hi/technology/6677545.stm

Let’s quickly touch on a few areas in a word or two. For more detail, http://en.wikipedia.org/wiki/Hard_disk_drive is an excellent source.

The disks, or platters, are held and rotated by the spindle. The Read/Write head moves across the surface of the disk and, of course, reads or writes the disk as it rotates. The head arm holds the Read/Write head(s) and is positioned by the voice coil actuator. The air filter keeps dust and the like out of the hard drive compartment.

Just for the record, the head “flies” above the disk with a very thin gap between the head and the disk. If it ever touches the disk while it spins, the head “crashes” and the disk is essentially useless. There are some data recovery technologies which can try to recover the data that is not on the crash path. However, as far as I know, it is not practical to extract data from the tracks that are actually in the crash zone.

Disk Block Layout

This diagram shows a rough view of the disk layout. The disk itself is split into tracks and sectors. There are two possibilities with this kind of arrangement, as you might guess or know. The first is variable track length (or circumference, if you prefer). This is the typical division into concentric circles: the circles near the center are smaller and so have a smaller circumference, while the outer circles are larger and have a greater circumference.

The other is fixed track length. This is normally achieved with a spiral instead of concentric circles, which is how optical discs are laid out. Hard disks get a similar effect while keeping concentric tracks: a track near the center of the disk holds fewer sectors than a track near the edge. This is the more standard industry practice. The diagram below shows a typical track and sector division. Each sector is further divided into blocks, and that division is on display here as well.

The block is the smallest individually addressable entity on the disk. It is usually 512 or 520 bytes, with 512 being the most common. There is also the Advanced Format, which raises the block size to 4096 (or 4K) bytes. The host system might need some support for this format, though. More on this later.

Shown at http://computershopper.com/feature/how-it-works-platter-based-hard-drive

The drive has the disk or platter(s) (1), the spindle (2) to hold and spin them, the head arm (3) to hold the Read/Write heads, the voice coil (4) to move the head around, the heads (5), and the head landing zone (9) where the head can rest without crashing on the disk during power down. The tracks (6), sectors (8) and blocks (7) are the locations on the disk where data is stored and retrieved.

Typically, the disk is accessed as a set of serially numbered blocks, each identified by its Logical Block Address (LBA). The on-disk electronics worry about translating an LBA into a physical location on the disk and storing or retrieving the information. This also makes life easier for the host system, as it only needs to deal with a logical block number; there is no need to remember which platter, which side, which track and which sector hold the block of interest.
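
As a rough illustration of what such a translation looks like, here is the classic cylinder/head/sector (CHS) to LBA formula in Python. The 16-head, 63-sector geometry is only an assumed example; real drives hide their true layout behind the controller, so treat this as a sketch of the idea rather than what any particular disk does internally.

```python
from __future__ import annotations

# Assumed example geometry; not taken from any specific drive.
HEADS_PER_CYLINDER = 16
SECTORS_PER_TRACK = 63


def chs_to_lba(cylinder: int, head: int, sector: int) -> int:
    """Classic CHS to LBA mapping; sectors are numbered from 1."""
    return (cylinder * HEADS_PER_CYLINDER + head) * SECTORS_PER_TRACK + (sector - 1)


def lba_to_chs(lba: int) -> tuple[int, int, int]:
    """Inverse mapping back to (cylinder, head, sector)."""
    cylinder, remainder = divmod(lba, HEADS_PER_CYLINDER * SECTORS_PER_TRACK)
    head, sector_index = divmod(remainder, SECTORS_PER_TRACK)
    return cylinder, head, sector_index + 1


if __name__ == "__main__":
    print(chs_to_lba(0, 0, 1))
    print(chs_to_lba(1, 0, 1))
    print(lba_to_chs(1008))
```

The host only ever sees the single integer; everything on the right-hand side of the formula stays inside the drive.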

Logical block addressing also lets the on-disk controller map out bad blocks without the host system having to worry about them. This is unwanted behavior in some cases, though.
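
One simple way to picture bad-block remapping is a small lookup table from a failed LBA to a spare block kept past the end of the user-visible area. The Python sketch below is a toy model under that assumption; the class and its methods are hypothetical, and real drive firmware is far more involved.

```python
from __future__ import annotations


class RemappingDisk:
    """Toy model: user LBAs 0..user_blocks-1, spares placed after them."""

    def __init__(self, user_blocks: int, spare_blocks: int) -> None:
        self.user_blocks = user_blocks
        self._next_spare = user_blocks
        self._spare_end = user_blocks + spare_blocks
        self._remap: dict[int, int] = {}

    def mark_bad(self, lba: int) -> None:
        """Redirect a failed user block to the next free spare block."""
        if not 0 <= lba < self.user_blocks:
            raise ValueError("LBA outside the user area")
        if lba in self._remap:
            return  # already remapped; this toy model keeps the first spare
        if self._next_spare >= self._spare_end:
            raise RuntimeError("out of spare blocks")
        self._remap[lba] = self._next_spare
        self._next_spare += 1

    def physical(self, lba: int) -> int:
        """Block actually accessed for a host-visible LBA."""
        return self._remap.get(lba, lba)


if __name__ == "__main__":
    disk = RemappingDisk(user_blocks=1000, spare_blocks=4)
    disk.mark_bad(42)
    print(disk.physical(42), disk.physical(43))
```

The host keeps asking for LBA 42 and never learns that the data now lives elsewhere, which is exactly why the behavior is unwanted when software wants to reason about physical placement.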

For Solid State Disks, or flash drives, there are no spindles, as you might guess. There are just a few chips under the hood storing and retrieving blocks of information, with a bit of glue logic to attach these storage chips to the bus. The blocks can again be accessed by LBA, and the controller translates that into a chip, sector/group and block address depending on the flash chip organization. Let me try to touch on SSDs in some more detail in a separate post. A typical small-scale example of an SSD is our USB pen drive or SD card.

Storage: The DAS stack

Let’s talk about a typical Direct Attached Storage stack from a server system’s perspective.

Let’s cover the storage stack in both software and hardware layers, which can later help us bring in the networked storage concepts with ease.

Let’s take a quick look at the stack for ease of understanding.

The Storage Subsystem

The storage subsystem stack diagram has software components in blue and hardware components in red. As you would expect, the two are in reverse order to each other. In a way, this stack might look just like a networking stack. Let’s walk through the maze.

When an application requests some operation on a file, the request is passed on to the filesystem layers. As in Linux, it’s possible to have a Virtual File System (VFS) layer which then switches to the actual filesystem driver to do the job. The filesystem then works its magic on its internal data structures to figure out where and how to serve the request. One good example is looking up an inode and its indirect blocks. Let’s save that for a detailed discussion later.

The filesystem then asks the disk access components, e.g. SATA or SCSI, for the blocks it’s interested in. Once the request arrives here, we are only talking in terms of block numbers, be it LBA or some other mechanism. We don’t know anything about the filesystem at all: give the block number and read or write it. The disk access subsystem then does the best it can, often rearranging the commands into sequential order to improve performance. Finally, it places the request packet with the HCI.
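
The rearranging step can be pictured as sorting the pending requests by block number and merging the ones that touch or overlap. Here is a minimal Python sketch of that idea for read requests only, each given as a hypothetical (start LBA, block count) pair; real I/O schedulers also weigh fairness, deadlines and write ordering, none of which is modeled here.

```python
from __future__ import annotations


def coalesce_reads(requests: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Sort (start_lba, count) reads and merge adjacent or overlapping ones."""
    merged: list[tuple[int, int]] = []
    for start, count in sorted(requests):
        if merged and start <= merged[-1][0] + merged[-1][1]:
            prev_start, prev_count = merged[-1]
            end = max(prev_start + prev_count, start + count)
            merged[-1] = (prev_start, end - prev_start)
        else:
            merged.append((start, count))
    return merged


if __name__ == "__main__":
    pending = [(100, 8), (0, 8), (8, 8), (108, 4)]
    print(coalesce_reads(pending))
```

Four scattered requests become two sequential ones, so the head (on a spinning disk) makes two passes instead of four.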

The host controller interface (HCI), or host bus adapter (HBA) in some cases such as SCSI, takes in the request packet: the series of commands for the disk, the associated buffers and the disk ID. It builds a queue that places those commands to the disk along with the disk ID. The queue is then passed on as a bunch of bytes/words to the bus interface driver. In some cases the HCI is a really complex piece of software; for SATA, for example, it contains many layers within itself, such as a transport layer and a link layer.

Finally, the data to be sent on the bus arrives at the bus drivers. The bus drivers are often simple ones that fiddle with a few flags and place the data on the bus, often with the help of the DMA (direct memory access) subsystem, and then let things go. When the DMA is done, an IRQ comes up and informs the driver that the operation is complete. The driver then returns control to the HCI driver, and the HCI driver might wait for its own IRQ saying that the command queue was sent successfully to the disk, or just return control to the disk driver. The disk driver then has to wait for the disk to finish the job, and an IRQ tells it when that is complete. Only then can control return to the filesystem, telling it that the job was well done and the day is going well.

We skipped the RAID component; let’s talk about it in a while.

Talking about the hardware, let’s start with the disks. A disk is just a store for data with some electronics glued on to transfer data in and out. Often, disks take a single Logical Block Address and return the appropriate block by maintaining an internal LBA to track/sector mapping. In the early days, disks had only minimal electronics, but later the Integrated Drive Electronics or IDE came into the picture. In the same way, the Small Computer System Interface (SCSI) came into existence. This is a dinosaur age story, but it might amuse you a bit. Though SCSI is a general interface for the system, it is mostly used for mass storage. Many storage devices, such as USB pen drives, emulate SCSI. Then came the AT Attachment or ATA (AT coming from PC-AT), and then the Serial ATA or SATA.
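The LBA to track/sector mapping can be pictured with the classic cylinder/head/sector arithmetic. This is only a sketch: the function names and the geometry values (`heads`, `sectors_per_track`) are assumptions for illustration, and a modern drive keeps its real internal mapping to itself.

```python
def lba_to_chs(lba, heads, sectors_per_track):
    """Convert a logical block address to (cylinder, head, sector).

    Sectors are numbered from 1 by convention; cylinders and heads from 0.
    """
    cylinder, remainder = divmod(lba, heads * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1


def chs_to_lba(cylinder, head, sector, heads, sectors_per_track):
    """Inverse mapping: LBA = (C * heads + H) * sectors_per_track + (S - 1)."""
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)
```

With an assumed geometry of 16 heads and 63 sectors per track, block 1000 lands on cylinder 0, head 15, sector 56, and converting back gives 1000 again.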

SATA, however, has its own protocol stack, though it supports a legacy mode in which the system can access a SATA subsystem as if it were PATA. As the SATA HCI can be operated in both the PATA backward compatibility mode and the advanced HCI mode, the controller includes both standard IDE electronics and advanced HCI components.

Finally, the bus interfaces. Most of today’s systems run on top of the PCIe, or Peripheral Component Interconnect Express, bus. In a few words, it’s an advanced serial I/O bus that interconnects the CPU with the peripherals at very high speed. Please visit http://en.wikipedia.org/wiki/PCI_Express for a quick look. Some posts on it will follow a bit later.

In the next post, let’s talk about blocks, RAID and stripes.

Storage: The first words – DAS, NAS & SAN

Let’s discuss storage for a while. No, I am not talking about food stores or cold storage. Let’s talk about electronic data storage.

As most of us know, data is stored in the computer in at least two places. One is the temporary store and the other is the permanent store. The temporary store holds the data and the program, like a book opened for reading and writing. This resource is often limited in size, volatile and pretty expensive. The permanent store, on the other hand, is relatively cheap: the cost per MB of temporary store is roughly equivalent to the cost per GB of permanent store. The permanent store, or secondary store, is also abundant and non-volatile, but significantly slower.

Let’s call the temporary store ‘the RAM’, which collectively refers to SDRAM, SRAM, DRAM, the L1/L2/L3 caches and similar stores. And let’s call the permanent store ‘the disk’, which includes hard disks, optical disks and similar devices.

The way information is stored in the RAM and on the disk is also significantly different. The RAM is often divided into pages and segments and addressed as a linear address space (by linear, I mean all of the logical, virtual and physical address spaces).

The way information is stored on the disk is a bit different and non-linear. The disk itself is organized into tracks and sectors to store the information physically. Logically, the information or data is organized into files and directories. There are other tiny stores too, but they mostly hold metadata, such as boot records or partition information.

The way to map the tracks and sectors into files and directories is often called a filesystem. There is more than one way to do such mapping, hence the different filesystems: ext3, FAT, ReiserFS, NTFS, or less known, proprietary filesystems such as WAFL.
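As a taste of what such a mapping involves, here is a toy sketch of how a byte offset within a file could be resolved through an inode-like table with direct and single-indirect block pointers. The constants (4 KiB blocks, 12 direct pointers, 4-byte pointers) and the function `locate` are assumptions chosen for illustration, not the layout of any particular filesystem.

```python
BLOCK_SIZE = 4096                    # assumed block size in bytes
N_DIRECT = 12                        # assumed number of direct pointers
PTRS_PER_BLOCK = BLOCK_SIZE // 4     # assumed 4-byte block pointers


def locate(offset):
    """Say which pointer slot covers a byte offset within a file.

    Returns (kind, slot, offset_within_block), where kind is 'direct'
    for the inode's own pointers or 'indirect' for a slot inside the
    single-indirect block.
    """
    index, within = divmod(offset, BLOCK_SIZE)
    if index < N_DIRECT:
        return ("direct", index, within)
    index -= N_DIRECT
    if index < PTRS_PER_BLOCK:
        return ("indirect", index, within)
    raise ValueError("offset is beyond the single-indirect range of this toy model")
```

Byte 5000 sits in the second direct block, while the first byte past the twelve direct blocks needs one extra lookup through the indirect block.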

The ways to attach the tracks, sectors and disks to the computing system are also diverse: Direct Attached Storage (DAS), Network Attached Storage (NAS) and Storage Area Networks (SAN). The best way to study the storage stack is to start with Direct Attached Storage. Networked storage works by inserting the network either between the application and the filesystem (NAS) or between the filesystem and the disks (SAN).


DAS

Let’s talk about DAS first. IMO, the easiest diagram is the one we have above. Though the diagram is just a few blocks, I intend to reuse it again and again, at least for a while.

The application never needs to know about the underlying storage architecture. The application needs the secondary store for at least two purposes. The first is to fetch program instructions, such as the application itself or its shared libraries. The other is to store and retrieve user data, such as databases, documents or spreadsheets.

The moment the application starts execution, it might need program instructions from a shared library or from some part of the executable file itself. For this part, the application does not need to deal with the filesystem at all. The application simply calls the required routine, and the operating system worries about talking to the filesystem and finding the way through the maze of tracks and sectors.

For the user data part, however, the application does need to worry about the filesystem, though only to a minimal extent. The application needs to ask the operating system to open a given file, read from it or write to it, create it or delete it, and close it when the job’s done. The operating system and the underlying filesystem code do the dirty job, just like a clerk at the filing cabinets. You ask for the required file and the clerk is happy to get it for you; if it is not available, you may ask him to add a new one. He opens the file, reads or modifies it as you ask, and returns it to the cabinet when done. In a similar way, the application gets the file (a file handle, more precisely), reads or modifies the file, and closes it when done. All the application worries about is whether the file at the given path exists and, if it exists, whether it can be accessed. If it can, the application opens it, reads or modifies it, rinses and repeats till the job is done, and then closes the file. It never needs to worry about what happens under the hood; it may not even care where the file is actually stored. At times the application uses a database instead of data files. It then simply talks to the database program, which worries about storing and retrieving data from the filesystem.
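From the application’s side, the whole conversation with the clerk boils down to a handful of calls. A minimal sketch, with `append_line` as a made-up helper name:

```python
import os


def append_line(path, line):
    """Create the file if it is missing, read it, add a line, close it.

    Returns what the file contained before the new line was added.
    """
    if not os.path.exists(path):
        # Ask for a new file to be added to the cabinet.
        with open(path, "w", encoding="utf-8"):
            pass
    with open(path, "r+", encoding="utf-8") as handle:   # open
        existing = handle.read()                         # read
        handle.seek(0, os.SEEK_END)
        handle.write(line + "\n")                        # modify
    return existing                                      # closed on leaving the with block
```

Nothing in this code knows about tracks, sectors or even which disk holds the file; the path is all the application ever sees.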

Databases, often the small-scale ones, store their data, records and tables in files on the disks. Large-scale databases, however, often bypass the filesystem, take the tracks and sectors off the disk directly and manage them using their own filesystem. In either case, there has to be some sort of filesystem under the hood to manage the tracks and sectors.

The OS, with the help of filesystem code, often in the form of filesystem drivers, translates the application’s requests for files into tracks and sectors, retrieves or writes the required ones, and thus manages them. How does the filesystem track and manage the tracks and sectors? How does the mapping happen between logical entities such as files and directories and the physical tracks and sectors? Let’s bother about those in a while. For now, let’s continue to the next layer.

The RAID/HBA layer translates the operating system’s requests into a language that is understood by the disks and disk interfaces. Often, it sits on top of stacked bus interfaces, such as SATA on a PCI-X bus. The HBA then talks to the disks in the language of the disks, such as LBA or Logical Block Addressing.

The Device Drivers – RTOS Vs. Linux


Typical RTOS device driver model

Let’s take a quick look at a typical RTOS device driver and throw it into the ring with a Linux driver.

The device driver’s philosophy
From a general perspective, a device driver is usually a set of routines enabling typical applications to talk to the hardware. In other words, device drivers are often translators, translating the application’s language into the hardware’s language. Device drivers also enable application programs to ignore hardware dependencies and focus on the actual job at hand.

The RTOS device driver
The device drivers for an RTOS, as shown in the diagram, are usually a bit different from GPOS or firmware device drivers. Device drivers can be broadly classified into three types: OS-less drivers, GPOS drivers and RTOS drivers.

OS-less drivers are usually just a bunch of routines enabling applications without an OS to do their job. Such applications are usually bootloaders, firmware boot-up routines or simple systems with round-robin applications.

The device drivers for a GPOS (general-purpose OS), for the most part, are also a bunch of device access routines. However, the key difference between drivers for OS-less systems and drivers for GPOS systems lies in driver concurrency and in the entry and exit points. More often than not, GPOS drivers also have to deal with security.

The RTOS drivers, however, make their home in an entirely different world. The drivers should follow the same basic principles and philosophy as the RTOS itself. The driver execution should be predictable, and it should also enable the whole system to be predictable and fault tolerant.

Let’s consider a scenario to picture those two properties. Say we open a file from a DVD-ROM in Windows Explorer, and the read encounters an error, causing the hardware and software to enter a retry loop. From Explorer, under normal circumstances, one cannot end that loop with a keystroke or even from the Task Manager. Under Linux, the only way to end the loop, and only the software loop at that, is to kill the process. The driver, however, continues to retry until the given number of retries is exhausted, and only then does it try to return an error code to the process, only to find that the process is long gone.

Under a typical RTOS scenario, the driver likewise starts executing the reads and runs into the failure. The difference starts here. Both the driver and the application start their operations with predefined timeout values. Hence, when a command is issued, say by the driver to the hardware, the driver waits for the response, say from the ISR, only for a timeout period. If the driver does not get a response from the hardware within the given time, it can choose to return an error to the application, indicating that it needs more time to do the job.

On the other hand, the drivers for most monolithic operating systems or GPOSes are mostly a bunch of routines with data structures tied to the calling process’s or a file descriptor’s context. Hence, when executed by multiple processes, the driver code contends for the hardware at the lowest level of the driver; the first process wins, while the others sleep on the hardware lock. Thus, a calling process that did not get the lock has to sleep/block. In some cases, the driver can also export a small framework that registers the request, returns immediately and responds later with the data, e.g. async file I/O. Though this allows the application process to be a bit more predictable in its execution, the application still lacks complete predictability. However, such a driver is a lot more scalable, with high performance on more capable hardware. More on that in some other post.

The RTOS driver, on the other hand, for the most part runs its own thread managing the requests. A typical driver implementation is depicted in the picture above. Let’s see what makes a driver under an RTOS more predictable.

The driver thread maintains a message queue. The application thread sends a request message along with a reference to its own queue, which should receive the response. The request message contains the data necessary for the driver to execute the request, say a read location or a pixel/framebuffer to write. Thus, once the application has sent the request, it has merely registered its request with the driver and is free to do other stuff till it gets its response back. The application can now sleep on the response queue with a timeout. Thus, the CPU is free to do other jobs instead of either continuously polling the queue or waiting indefinitely. If the application resumes execution and finds that it has not received the response from the driver yet, it may choose to do something else, such as informing the user or taking other corrective action.
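The request/response exchange can be modelled on a desktop with Python’s standard `queue` and `threading` modules. This is only a sketch of the pattern, not RTOS code: `driver_loop` and `read_request` are made-up names, and the "hardware" is faked by doubling the requested location.

```python
import queue
import threading


def driver_loop(inbox):
    """Driver thread: serve requests one by one until a None sentinel arrives."""
    while True:
        message = inbox.get()
        if message is None:
            break
        location, reply_queue = message
        # Stand-in for real hardware access: the "data" is location * 2.
        reply_queue.put(("ok", location * 2))


def read_request(inbox, location, timeout):
    """Application side: post a request, then sleep on our own reply queue."""
    reply_queue = queue.Queue(maxsize=1)
    inbox.put((location, reply_queue))
    try:
        return reply_queue.get(timeout=timeout)
    except queue.Empty:
        # No answer in time: the caller may inform the user or retry.
        return ("timeout", None)
```

If no driver thread is serving the inbox, `read_request` comes back with `("timeout", None)` after the given period instead of hanging forever, which is exactly the property the application relies on.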

What makes the driver predictable is not much different from the above scenario. The driver picks up the first request from the message queue and starts executing the job. It issues the request to the hardware and waits for the response on a semaphore, shared between the driver and the ISR, with a timeout. The hardware, after the job is done, responds by raising the interrupt, causing the CPU to run the ISR. The ISR in turn releases the semaphore, thus waking the driver thread up again. The driver then responds with the result to the application. If there was an error and the hardware did not respond in time, the driver wakes up because of the timeout and figures out that the hardware failed to do the job in time. The driver then responds with an error to the application and issues an abort to the hardware for the current job. The application can then choose whether to retry or not by sending another request. In some realtime systems, however, the driver itself retries a couple of times before giving up. The application, upon deciding to abort the current operation, issues a high-priority message to the driver thread, causing the driver to process that message immediately, like a backdoor, and thus abort the current operation.
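The driver side of this can be sketched the same way. Here `FakeDevice` is a hypothetical stand-in for the hardware, with a timer thread playing the role of the interrupt, and `run_job` is an assumed name for the driver’s wait/abort/retry logic.

```python
import threading


class FakeDevice:
    """Pretend hardware: 'interrupts' after `latency` seconds, or never if None."""

    def __init__(self, latency):
        self.latency = latency
        self.done = threading.Semaphore(0)   # shared between driver and "ISR"
        self.result = None
        self.aborts = 0

    def start(self, value):
        """Issue a command; a timer thread stands in for the interrupt."""
        if self.latency is not None:
            timer = threading.Timer(self.latency, self._isr, args=(value,))
            timer.daemon = True
            timer.start()

    def _isr(self, value):
        self.result = value
        self.done.release()                  # wake the waiting driver

    def abort(self):
        self.aborts += 1


def run_job(device, value, timeout, retries=0):
    """Driver side: issue the command, wait on the semaphore with a timeout."""
    for _ in range(retries + 1):
        device.start(value)
        if device.done.acquire(timeout=timeout):
            return ("ok", device.result)
        device.abort()                       # hardware missed its deadline
    return ("error", "timeout")
```

A real driver would also have to cope with a late interrupt arriving after the abort; this sketch leaves that out.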

Facilities such as task priorities and timeouts on sleeping contexts, such as a semaphore wait, enable the designer to build a highly predictable system. Life support systems are often realtime systems with redundancies built in. That means that if a piece of hardware fails to do its job in time, the driver immediately turns the device off, bringing the system to a known state, and picks the secondary hardware to do the job. The same is the case with mission-critical systems, be they nuclear control systems or avionics. As this shows, handling redundancies is another speciality of an RTOS device driver.
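The failover idea fits in a few lines. In this sketch, `Unit` and `read_with_failover` are invented names, and a unit that misses its deadline is simulated by raising `TimeoutError` rather than by actually waiting.

```python
class Unit:
    """Hypothetical redundant hardware unit."""

    def __init__(self, name, healthy):
        self.name = name
        self.healthy = healthy
        self.powered = True

    def read(self, timeout):
        if not self.healthy:
            # Simulates a device that stays silent past its deadline.
            raise TimeoutError(f"{self.name} did not answer within {timeout}s")
        return 42

    def power_off(self):
        self.powered = False


def read_with_failover(units, timeout):
    """Try each unit in order; switch a failed unit off before moving on."""
    for unit in units:
        try:
            return unit.name, unit.read(timeout)
        except TimeoutError:
            unit.power_off()   # bring the failed unit to a known state
    raise RuntimeError("all redundant units failed")
```

The point of powering the failed unit off before trying the next one is that the system never continues with hardware in an unknown state.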

However, the message queue and the serialization of the code with priorities often tend to limit the scalability of such systems.

For high-performance hardware that can take in multiple commands at once and respond with the results one by one, such as DMA controllers, the ISR does not release a semaphore. Instead, it sends a message with the results read from the hardware to the same driver queue. The driver, when it gets a chance to process the response message, looks at the context stored by the ISR, figures out which application message it belongs to, and responds with the appropriate result.
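One way to picture this variant is a driver that tags every command and keeps a table of who is waiting for which tag. The class below is a single-threaded sketch under assumed names (`Driver`, `submit`, `isr`, `step`); the `hardware` list merely records the commands that would be handed to the device.

```python
import queue


class Driver:
    """Driver whose single inbox carries both requests and ISR completions."""

    def __init__(self):
        self.inbox = queue.Queue()
        self.pending = {}      # tag -> reply queue of the waiting application
        self.next_tag = 0

    def submit(self, payload, reply_queue):
        """Application side: post a request message."""
        self.inbox.put(("request", payload, reply_queue))

    def isr(self, tag, result):
        """Interrupt side: post the result to the same queue, no semaphore."""
        self.inbox.put(("complete", tag, result))

    def step(self, hardware):
        """Process exactly one message from the inbox."""
        kind, first, second = self.inbox.get_nowait()
        if kind == "request":
            tag = self.next_tag
            self.next_tag += 1
            self.pending[tag] = second        # remember who asked
            hardware.append((tag, first))     # hand the command to the device
        else:
            reply_queue = self.pending.pop(first)
            reply_queue.put(second)           # answer the right application
```

Because each completion carries its tag, the results can come back in any order and still reach the application that asked for them.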

That closes the topic of RTOS drivers for now. I’ll try to throw some light on tasks and task priorities some time.
