Final updates on A64-OLinuXino GMAC and eMMC, we are ready to launch production

We have completed our tests with Rev.B.

The good news is that the Gigabit interface works well with the Micrel/Microchip PHY and delivers real Gigabit bandwidth. The A20 also has a Gigabit interface but can’t push more than about 700 Mbit/s; I guess this is related to the A20’s ability to handle the data from the GMAC. With the A64 the speed is 932 Mbit/s, i.e. very close to 1 Gbit/s:

root@A64-OLinuXino:~# iperf -s 
 ------------------------------------------------------------ 
 Server listening on TCP port 5001 
 TCP window size: 85.3 KByte (default) 
 ------------------------------------------------------------ 
 [  4] local 10.0.0.4 port 5001 connected with 10.0.0.1 port 41144 
 [ ID] Interval       Transfer     Bandwidth 
 [  4]  0.0-10.0 sec  1.09 GBytes   932 Mbits/sec

 

For eMMC we followed the advice to make it dual voltage, 3.3V and 1.8V, with the aim of getting faster transfers, and we implemented this in the hardware. The tests show, however, that the transfer rate is about the same, and at 1.8V it is even a bit lower. I don’t know whether this is due to lame settings in our eMMC drivers or whether the eMMC we use simply has the same transfer rate at both voltages (we checked the datasheet, and the eMMC we use has the same speed quoted for both voltages), so this may be useless for our eMMC chip:

eMMC clock: 52 Mhz

eMMC@3.3V 
root@A64-OLinuXino:/home/olimex# dd if=/dev/zero of=/mnt/output conv=fdatasync bs=384k count=1k; rm -f /mnt/output 
1024+0 records in 
1024+0 records out 
402653184 bytes (403 MB, 384 MiB) copied, 33.0437 s, 12.2 MB/s 
 
eMMC@1.8V 
root@A64-OLinuXino:/home/olimex# dd if=/dev/zero of=/mnt/output conv=fdatasync bs=384k count=1k; rm -f /mnt/output 
1024+0 records in 
1024+0 records out 
402653184 bytes (403 MB, 384 MiB) copied, 37.9408 s, 10.6 MB/s 
 
SDMMC clock: 40MHz 
 
SDMMC@3.3V 
root@A64-OLinuXino:/home/olimex# dd if=/dev/zero of=/tmp/output conv=fdatasync bs=384k count=1k; rm -f /tmp/output 
1024+0 records in 
1024+0 records out 
402653184 bytes (403 MB, 384 MiB) copied, 41.1578 s, 9.8 MB/s 
 

With SDMMC, as we don’t know what SD card will be inserted, the clock is set to the default 40 MHz.
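The dd figures above can be re-derived from the byte counts and times in the logs. This is just a small arithmetic sketch; the function name `throughput_mb_s` is made up for illustration, and dd is assumed to report decimal megabytes (10^6 bytes) per second, which matches the numbers it printed.

```python
# Recompute the dd throughput from the logs above.
# Byte count and durations are copied from the dd output; nothing here is measured.
BLOCK_BYTES = 384 * 1024   # bs=384k
BLOCK_COUNT = 1024         # count=1k
TOTAL_BYTES = BLOCK_BYTES * BLOCK_COUNT


def throughput_mb_s(total_bytes: int, seconds: float) -> float:
    """Throughput in decimal megabytes per second (hypothetical helper)."""
    return total_bytes / seconds / 1_000_000


runs = {
    "eMMC @ 3.3V": 33.0437,
    "eMMC @ 1.8V": 37.9408,
    "SD card @ 3.3V": 41.1578,
}

for name, seconds in runs.items():
    print(f"{name}: {throughput_mb_s(TOTAL_BYTES, seconds):.1f} MB/s")

# Relative slowdown of the 1.8V run compared to the 3.3V run
slowdown = 1 - throughput_mb_s(TOTAL_BYTES, 37.9408) / throughput_mb_s(TOTAL_BYTES, 33.0437)
print(f"1.8V is about {slowdown:.0%} slower in this single run")
```

Keep in mind each figure comes from one run only, so the difference between 3.3V and 1.8V may partly be run-to-run noise.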

After re-checking that everything works, we will make some last cosmetic changes to the audio part, which we noticed at the last moment, and will run Rev.C in production.

A64-OLinuXino-eMMC rev.B OSHW 64 bit ARM development board prototypes are being tested

What you see is our improved REV.B of A64-OLinuXino. What’s new:

  • Gigabit PHY is now KSZ9031 from MICROCHIP/MICREL, which allows the board to be produced in both commercial and industrial grade!
  • DDR3 is now DDR3L for lower power
  • we added SPI flash footprint U12
  • Audio input is now jumper selectable between LINE-IN and MIC-IN
  • eMMC can now work at a software selectable voltage of 3.3V or 1.8V, which would allow faster speeds
  • status LED is attached to port PE17
  • size 90×60 mm

We are now doing the final software tests, and if everything is OK we will start production.

 

TERES-I DIY Open Source Hardware hacker’s Laptop update

keyb

It has been a long time since I blogged about our laptop project.

What is the status? We have the first PCBs prototyped and most of the parts work fine.

We had to make a matrix keyboard + I2C touchpad to USB converter board. We did this with a small AVR.

For this project we couldn’t use any of our standard connectors. We had to source all new ones: mini HDMI connectors, USB host connectors, power jack, audio jack connectors. They all had to be low profile and embedded inside the PCB, hence the odd shape of the main PCB:

PCB

The LCDs used in laptops are not like normal LCDs. They are very thin, only 3 mm or less, and their cable is special: it must carry as few wires as possible, twisted together into a very thin round cable, because it has to pass through the plastic hinges of the laptop, where a normal cable can’t fit. This is why laptop LCDs use neither parallel RGB nor LVDS but the eDP interface.

Unfortunately the A64 does not support this interface, so we started to search for LVDS/HDMI/RGB to eDP converter ICs. We found that the solutions from Western suppliers (TI etc.) are more expensive than the A64 chip itself, so no go. We found a Chinese solution for $1, the NCS8801, and we said: well, this is our solution 🙂 We made a PCB prototype and sourced a few chips, then we struggled with the lack of documentation 🙂 The ‘datasheet’ is 30 pages, and the only code on the net initializes registers at addresses not mentioned in the datasheet. After spending almost 4 weeks on this we gave up and started looking for another solution. We found the ANX6345, which is a bit more expensive but has some code in the Linux kernel and seems to be used with Rockchip ICs, so we hope it will solve the LCD issue. We designed a new board and got the new prototypes a few days ago; they are now waiting for an open slot on the assembly line. Fingers crossed that everything works 🙂

The mechanical parts have their history too. In June we placed orders with several different suppliers for the plastic parts, speakers, touchpads, power adapters, screws and hinges, 40 different parts in total that go inside the laptop. The orders were completed in July and consolidated into one shipment. On August 6 it was sent by TNT express and 2 days later it was at Sofia airport, but the troubles had just begun 🙂

Importing something may seem very easy to outsiders, but it has its tricks. Usually every component can be classified under several positions in the customs tariff. LCDs, for instance, have at least 7-8 different codes under which they can be imported: as a display for computing equipment, for a TV, for signage, for a metal processing machine, etc. The trouble is that all these positions carry different import tax 🙂 and of course Bulgarian customs try to make you pay under the highest tariff code unless you prove otherwise. Another issue is that most of the people working there have an economics education and very few know electronics. Import tax starts at 0% for computer parts and goes up to 4-5% for TVs and machines, not a small amount when you are talking about parts for a $200 laptop! So the laptop parts sat at customs for 3 weeks while the customs officers tried to classify every hinge, screw and plastic part as a separate product under the highest code. Fortunately, after 3 weeks of thinking, somebody with common sense allowed all the laptop spare parts to be imported as such with 0% tax, and we got them today. The fight will continue, though, as this was only the 10% of the order that we wanted to receive promptly, paying for expensive air transport. The remaining 90% of the parts are still travelling by sea and will arrive at the end of September, so let’s see how they will be classified when they arrive 🙂

We get a lot of requests asking when the laptop will be done, and we love all our impatient customers 🙂

Guys, be sure that we are doing everything humanly possible to release it as soon as we can, but designing something from scratch that you have never done before is not easy. Once we do this I’m sure we will easily make 10 other laptops, but the first time is always more difficult, and arranging the logistics for so many parts and the production is no less challenging.

 

P.S. I hope you like the “Super” key on our new keyboard above 🙂

CT800 – embedded FLOSS Chess computer made with STM32-H405 OSHW board

sideview

We got an interesting project link from Rasmus Althoff: CT800 is a Free/Libre Open Source Software chess computer made with the STM32-H405 Open Source Hardware board inside.

It has around 2100 ELO and a maximum search depth of 20 plies. The software is written in C and released under the GPL3 license.

FPGA tutorial – VGA video generation with iCE40HX1K-EVB + iCE40-IO in Verilog

iCE40-IO-1

iCE40-IO is an Open Source Hardware snap-to module for iCE40HX1K-EVB which adds VGA, PS2 and an IrDA transceiver.

In this tutorial you will learn how to generate VGA video signals, how to capture PS2 keys and how to move an object on the video screen.

Here is my setup:

setup

I have iCE40HX1K-EVB snapped to iCE40-IO, with a PS2 keyboard and VGA connected to it, and OLIMEXINO-32U4 as programmer.

The tutorial project is on GitHub. Let’s first see example_0.v

Yesterday, after I shared my experience with Verilog silently defining signals that you may have typed by mistake, Andrew Zonenberg commented that you can tell Verilog to treat this as an error by adding “`default_nettype none” as the first line of your code. I checked and it works fine, so I will use it in all my further sources 🙂 Thanks for the tip, Andrew!

The code starts with:

   `default_nettype none //disable implicit definitions by Verilog

   module top( //top module and signals wired to FPGA pins
    CLK100MHz,
    vga_r,
    vga_g,
    vga_b,
    vga_hs,
    vga_vs,
    ps2_clk,
    ps2_data
   );

 

here we define the top module and the physical signals we will use: CLK100MHz, VGA R, G, B, H-sync, V-sync, and the PS2 clock and data

then we must declare each of them:

   input CLK100MHz; // Oscillator input 100Mhz
   output [2:0] vga_r; // VGA Red 3 bit
   output [2:0] vga_g; // VGA Green 3 bit
   output [2:0] vga_b; // VGA Blue 3 bit
   output vga_hs; // H-sync pulse 
   output vga_vs; // V-sync pulse
   input ps2_clk; // PS2 clock
   input ps2_data; // PS2 data

 

as you can see the VGA R, G, B signals are 3 bits wide each, which gives VGA 9-bit color, or 512 different colors

the next part uses the new keyword parameter; this is how constants are defined in Verilog:

  parameter h_pulse = 96; //H-SYNC pulse width 96 * 40 ns (25 Mhz) = 3.84 uS
  parameter h_bp = 48; //H-BP back porch pulse width
  parameter h_pixels = 640; //H-PIX Number of pixels horisontally
  parameter h_fp = 16; //H-FP front porch pulse width
  parameter h_pol = 1'b0; //H-SYNC polarity
  parameter h_frame = 800; //800 = 96 (H-SYNC) + 48 (H-BP) + 640 (H-PIX) + 16 (H-FP)
  parameter v_pulse = 2; //V-SYNC pulse width
  parameter v_bp = 33; //V-BP back porch pulse width
  parameter v_pixels = 480; //V-PIX Number of pixels vertically
  parameter v_fp = 10; //V-FP front porch pulse width
  parameter v_pol = 1'b1; //V-SYNC polarity
  parameter v_frame = 525; // 525 = 2 (V-SYNC) + 33 (V-BP) + 480 (V-PIX) + 10 (V-FP)

  parameter square_size = 10; //size of the square we will move
  parameter init_x = 320; //initial square position X
  parameter init_y = 240; //initial square position Y
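The timing parameters above can be sanity-checked with a bit of arithmetic. The sketch below assumes the 25 MHz pixel clock mentioned in the code comments; the dictionary and variable names are only for illustration and do not come from the tutorial sources.

```python
# Sanity check of the 640x480 VGA timing parameters, assuming a 25 MHz pixel clock.
PIXEL_CLOCK_HZ = 25_000_000

# Values copied from the Verilog parameters (units: pixel clocks / lines)
H = {"pulse": 96, "bp": 48, "pixels": 640, "fp": 16}
V = {"pulse": 2, "bp": 33, "pixels": 480, "fp": 10}

h_frame = sum(H.values())                  # pixel clocks per line, should match h_frame
v_frame = sum(V.values())                  # lines per frame, should match v_frame

pixel_ns = 1e9 / PIXEL_CLOCK_HZ            # duration of one pixel clock in ns
hsync_us = H["pulse"] * pixel_ns / 1000    # H-sync pulse width in microseconds
line_rate_hz = PIXEL_CLOCK_HZ / h_frame    # horizontal (line) frequency
frame_rate_hz = line_rate_hz / v_frame     # vertical (refresh) frequency

print(f"h_frame={h_frame}, v_frame={v_frame}")
print(f"pixel clock period: {pixel_ns:.0f} ns, H-sync pulse: {hsync_us:.2f} us")
print(f"line rate: {line_rate_hz:.0f} Hz, frame rate: {frame_rate_hz:.2f} Hz")
```

With a 25 MHz clock the refresh rate comes out slightly below 60 Hz, which monitors generally tolerate.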

 

for VGA timing we will use a 25 MHz clock, made by dividing CLK100MHz by 4:

  reg [1:0] clk_div; // 2 bit counter
  wire vga_clk;

  assign vga_clk = clk_div[1]; // 25Mhz clock = 100Mhz divided by 2-bit counter

  always @ (posedge CLK100MHz) begin // 2-bit counter ++ on each positive edge of 100Mhz clock
   clk_div <= clk_div + 2'b1;
  end

 

vga_clk is bit 1 (the most significant bit) of clk_div, which increments on each positive edge of the 100 MHz clock, so vga_clk completes one period every 4 input clocks
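To see why the top bit of a 2-bit counter gives a divide-by-4 clock, here is a tiny software model. It is only a sketch of the idea, not generated from the Verilog; the function name `count_rising_edges_of_msb` is hypothetical.

```python
def count_rising_edges_of_msb(input_clocks: int, width: int = 2) -> int:
    """Count rising edges on the top bit of a free-running counter.

    The counter starts at 0 and increments once per input clock,
    wrapping around at 2**width like a fixed-width register.
    """
    counter = 0
    prev_msb = 0
    edges = 0
    for _ in range(input_clocks):
        counter = (counter + 1) % (1 << width)
        msb = (counter >> (width - 1)) & 1
        if msb == 1 and prev_msb == 0:
            edges += 1
        prev_msb = msb
    return edges


# 100 input clocks produce 25 rising edges on the top bit: a divide-by-4 clock
print(count_rising_edges_of_msb(100))
```

Scaled up, 100 MHz in gives 25 MHz out, which is the pixel clock used for the VGA timing.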

then we define the registers which will hold the VGA signals:

  reg [2:0] vga_r_r; //VGA color registers R,G,B x 3 bit
  reg [2:0] vga_g_r;
  reg [2:0] vga_b_r;
  reg vga_hs_r; //H-SYNC register
  reg vga_vs_r; //V-SYNC register

  assign vga_r = vga_r_r; //assign the output signals for VGA to the VGA registers
  assign vga_g = vga_g_r;
  assign vga_b = vga_b_r;
  assign vga_hs = vga_hs_r;
  assign vga_vs = vga_vs_r;

 

we want the video generation to start after some delay, not immediately, and for this we will use two signals:

  reg [7:0] timer_t = 8'b0; // 8 bit timer with 0 initialization
  reg reset = 1;

 

the 8-bit timer makes the necessary delay; reset is an internal signal and has nothing in common with the reset button on the board

these registers hold the position of the “video beam” while the video is generated; we need two sets of them, as one counts through the complete frame including the “invisible” part, and the other tracks just the visible part

  reg [9:0] c_row; //visible area row (follows c_ver only inside the visible area)
  reg [9:0] c_col; //visible area column (follows c_hor only inside the visible area)
  reg [9:0] c_hor; //complete frame counter, horizontal
  reg [9:0] c_ver; //complete frame counter, vertical

 

this signal flags if the display is enabled or disabled

  reg disp_en; //display enable flag

 

these registers will hold the center coordinates of the visible square we draw on the screen:

  reg [9:0] sq_pos_x; //position of square center X, Y
  reg [9:0] sq_pos_y;

 

these wires hold the upper left and lower right coordinates of the square we draw:

  wire [9:0] l_sq_pos_x; //upper left and down right corners of the square
  wire [9:0] r_sq_pos_x;
  wire [9:0] u_sq_pos_y;
  wire [9:0] d_sq_pos_y;

  assign l_sq_pos_x = sq_pos_x - square_size;
  assign r_sq_pos_x = sq_pos_x + square_size;
  assign u_sq_pos_y = sq_pos_y - square_size;
  assign d_sq_pos_y = sq_pos_y + square_size;

 

the next registers are for reading the PS2 keyboard:

  reg [3:0] ps2_cntr; // 4-bit PS2 clock counter
  reg [7:0] ps2_data_reg; // 8-bit PS2 data register
  reg [7:0] ps2_data_reg_prev; // previous 8-bit PS data register
  reg [7:0] ps2_data_reg_prev1; // previous previous 8-bit data register
  reg [10:0] ps2_dat_r; // 11-bit complete PS2 frame register

  reg [1:0] ps2_clk_buf; // PS2 clock buffer
  wire ps2_clk_pos; // PS2 positive edge detected signal

  reg u_arr = 0; //PS2 arrow keys detect flags
  reg l_arr = 0;
  reg d_arr = 0;
  reg r_arr = 0;

the 4-bit counter counts PS2 clocks; the three data registers hold three sequential key codes, as some keys are transmitted as two bytes when pressed and three when released
ps2_clk_buf is used to detect the rising edge of the PS2 clock:

  assign ps2_clk_pos = (ps2_clk_buf == 2'b01); 
         // edge detector: positive edge is when the buffer is '01' (old sample 0, new sample 1)
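The edge detector idea can be tried out in software: keep the last two samples in a 2-bit buffer and flag a rising edge when the older sample is 0 and the newer one is 1. This is a simplified model with a hypothetical function name; it ignores the one-clock delay that non-blocking assignments add in the real design, and the starting buffer value is a parameter I chose for the model.

```python
def rising_edges(samples, initial=0b11):
    """Return the indices at which a 0 -> 1 transition is seen.

    `samples` is the PS2 clock sampled once per 25 MHz tick (0 or 1).
    `initial` is the starting content of the 2-bit buffer; 0b11 assumes
    the line was idle high before the first sample (an assumption of this model).
    """
    buf = initial & 0b11
    hits = []
    for i, sample in enumerate(samples):
        buf = ((buf << 1) | sample) & 0b11   # shift old value left, take in the new sample
        if buf == 0b01:                      # older sample 0, newer sample 1
            hits.append(i)
    return hits


print(rising_edges([1, 0, 0, 1, 1, 0, 1]))
```

Note that starting the buffer at 0b00 while the line idles high would report one extra edge at the very first sample.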

 

the 25 MHz clock is used to sample the PS2 clock and data:

  always @ (posedge vga_clk) begin // on each positive edge at 25Mhz clock
    ps2_clk_buf[1:0] <= {ps2_clk_buf[0], ps2_clk}; 
             // shift old value left and get current value of ps2_clk
    if(ps2_clk_pos == 1) begin // on positive edge
     ps2_cntr <= ps2_cntr + 1;
     if(ps2_cntr == 10) begin 
         // when we got 10 clocks save the PS2 data to ps2_data_reg, 
         // ps2_data_reg_prev and ps2_data_reg_prev1
      ps2_cntr <= 0; // so we have last 3 data values captured from PS2 keyboard
      ps2_data_reg[7] <= ps2_dat_r[0];
      ps2_data_reg[6] <= ps2_dat_r[1];
      ps2_data_reg[5] <= ps2_dat_r[2];
      ps2_data_reg[4] <= ps2_dat_r[3];
      ps2_data_reg[3] <= ps2_dat_r[4];
      ps2_data_reg[2] <= ps2_dat_r[5];
      ps2_data_reg[1] <= ps2_dat_r[6];
      ps2_data_reg[0] <= ps2_dat_r[7];
      ps2_data_reg_prev <= ps2_data_reg;
      ps2_data_reg_prev1 <= ps2_data_reg_prev;
    end
    ps2_dat_r <= {ps2_dat_r[9:0], ps2_data}; // data shift left
   end

at this point we have detected when the PS2 keyboard starts sending data and have captured the transmitted data

here is where we detect whether the left, right, up and down keys are pressed:

  if(ps2_data_reg_prev1 == 8'he0 && ps2_data_reg_prev == 8'hf0) begin 
      // 0xE0 0xF0 sequence means key released
    if(ps2_data_reg == 8'h75) begin
      u_arr <= 0; //0x75 up key
    end
    else if(ps2_data_reg == 8'h6b) begin
      l_arr <= 0; //0x6B left key
    end
    else if(ps2_data_reg == 8'h72) begin
      d_arr <= 0; //0x72 down key
    end
    else if(ps2_data_reg == 8'h74) begin
      r_arr <= 0; //0x74 right key
    end
  end
  if(ps2_data_reg_prev == 8'he0) begin //0xE0 means key pressed
    if(ps2_data_reg == 8'h75) begin
      u_arr <= 1; //0x75 up key
    end
    else if(ps2_data_reg == 8'h6b) begin
      l_arr <= 1; //0x6B left key
    end
    else if(ps2_data_reg == 8'h72) begin
      d_arr <= 1; //0x72 down key
    end
    else if(ps2_data_reg == 8'h74) begin
      r_arr <= 1; //0x74 right key
    end
  end
 end

 

Now let’s generate the video signal:

  always @ (posedge vga_clk) begin //25Mhz clock
 
   if(timer_t > 250) begin // generate 10 uS RESET signal
     reset <= 0;
   end
   else begin
     reset <= 1; //while in reset display is disabled, square is set to initial position
     timer_t <= timer_t + 1;
     disp_en <= 0;
     sq_pos_x <= init_x;
     sq_pos_y <= init_y;
   end

 

with timer_t we generate an initial RESET of about 10 uS (roughly 250 cycles of 40 ns), during which the display is not active and we load the initial XY coordinates in the middle of the visible area

this code updates current beam position:

  if(reset == 1) begin //while RESET is high init counters
    c_hor <= 0;
    c_ver <= 0;
    vga_hs_r <= 1;
    vga_vs_r <= 0;
    c_row <= 0;
    c_col <= 0;
  end
  else begin // update current beam position
    if(c_hor < h_frame - 1) begin
       c_hor <= c_hor + 1;
    end
    else begin
       c_hor <= 0;
       if(c_ver < v_frame - 1) begin
          c_ver <= c_ver + 1;
       end
       else begin
          c_ver <= 0;
       end
    end
 end

 

H-sync and V-sync generation:

   if(c_hor < h_pixels + h_fp + 1 || c_hor > h_pixels + h_fp + h_pulse) begin 
     // H-SYNC generator
     vga_hs_r <= ~h_pol;
   end
   else begin
     vga_hs_r <= h_pol;
   end
   if(c_ver < v_pixels + v_fp || c_ver > v_pixels + v_fp + v_pulse) begin 
     //V-SYNC generator
     vga_vs_r <= ~v_pol;
   end
   else begin
     vga_vs_r <= v_pol;
   end
   if(c_hor < h_pixels) begin //c_col and c_row counters are 
                    //updated only in the visible time-frame
     c_col <= c_hor;
   end
   if(c_ver < v_pixels) begin
     c_row <= c_ver;
   end
   if(c_hor < h_pixels && c_ver < v_pixels) begin //VGA color signals are 
                   //enabled only in the visible time frame
     disp_en <= 1;
   end
   else begin
     disp_en <= 0;
   end

 

now we draw the red frame and the blue square:

if(disp_en == 1 && reset == 0) begin
 if(c_row == 0 || c_col == 0 || c_row == v_pixels-1 || c_col == h_pixels-1) begin //generate red frame with size 640x480
 vga_r_r <= 7;
 vga_g_r <= 0;
 vga_b_r <= 0;
 end
 else if(c_col > l_sq_pos_x && c_col < r_sq_pos_x && c_row > u_sq_pos_y && c_row < d_sq_pos_y) begin //generate blue square
 vga_r_r <= 0;
 vga_g_r <= 0;
 vga_b_r <= 7;
 end
 else begin //everything else is black
 vga_r_r <= 0;
 vga_g_r <= 0;
 vga_b_r <= 0;
 end
 end
 else begin //when display is not enabled everything is black
 vga_r_r <= 0;
 vga_g_r <= 0;
 vga_b_r <= 0;
 end

you can change the colors by editing the RGB values above
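The color decision above can be modeled as a plain function of the beam position and the square center. This is a sketch for experimenting, assuming the display is enabled; `pixel_color` and the constant names are made up for the model. The 10-bit mask mimics the fact that the corner wires are only 10 bits wide, so subtraction wraps around instead of going negative.

```python
MASK = 0x3FF                      # the corner wires are 10 bits wide, so values wrap modulo 1024
H_PIXELS, V_PIXELS = 640, 480     # visible area, from the Verilog parameters
SQUARE_SIZE = 10                  # square_size parameter


def pixel_color(col, row, sq_x, sq_y):
    """Return (r, g, b), each 0..7, for one visible pixel (hypothetical model)."""
    left = (sq_x - SQUARE_SIZE) & MASK
    right = (sq_x + SQUARE_SIZE) & MASK
    up = (sq_y - SQUARE_SIZE) & MASK
    down = (sq_y + SQUARE_SIZE) & MASK
    if row == 0 or col == 0 or row == V_PIXELS - 1 or col == H_PIXELS - 1:
        return (7, 0, 0)          # red frame on the border of the visible area
    if left < col < right and up < row < down:
        return (0, 0, 7)          # blue square (strict comparisons, as in the Verilog)
    return (0, 0, 0)              # everything else is black


print(pixel_color(0, 100, 320, 240))    # border pixel
print(pixel_color(320, 240, 320, 240))  # center of the square
print(pixel_color(310, 240, 320, 240))  # just outside the left side of the square
```

Because the comparisons are strict, the square in this model covers columns 311 to 329, i.e. 19 pixels across. Also, once the center gets closer than square_size to the left or top edge, the wrapped corner value becomes a large number and no pixel satisfies the comparison anymore.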

once per frame we update the square position depending on the keys pressed:

  if(c_row == 1 && c_col == 1) begin //once per video frame
    if(u_arr) begin
      sq_pos_y <= sq_pos_y - 1;
    end;

  if(d_arr) begin
    sq_pos_y <= sq_pos_y + 1;
  end;

  if(l_arr) begin
    sq_pos_x <= sq_pos_x - 1;
  end;

  if(r_arr) begin
    sq_pos_x <= sq_pos_x + 1;
   end;
  end

 

now let’s save the code as example.v, synthesize and program.
Here is what we see:

video-1

 

when we press and hold arrow keys the square is moving across the screen yey!

 

move

but there is a problem: if we reach the end of the frame, the square goes outside it 🙂

 

crop

 

How can we fix this?

Let’s go back to the code which updates the position. Obviously we have to add another if that checks whether the square is at the frame edge:

  if(c_row == 1 && c_col == 1) begin //once per video frame
    if(u_arr) begin
      if (sq_pos_y > square_size) begin
        sq_pos_y <= sq_pos_y - 1;
      end
    end;

    if(d_arr) begin
      if (sq_pos_y < (v_pixels - 1 - square_size)) begin
        sq_pos_y <= sq_pos_y + 1;
      end
    end;

  if(l_arr) begin
    if (sq_pos_x > square_size) begin
      sq_pos_x <= sq_pos_x - 1;
    end
  end;

  if(r_arr) begin
    if (sq_pos_x < (h_pixels - 1 - square_size)) begin
      sq_pos_x <= sq_pos_x + 1;
    end
  end;
 end
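The bounds check above can be tried in software before synthesizing. Below is a minimal model of the once-per-frame update; `step` and its keyword arguments are hypothetical names, and all reads use the old position, the way non-blocking assignments behave.

```python
H_PIXELS, V_PIXELS, SQUARE_SIZE = 640, 480, 10   # from the Verilog parameters


def step(x, y, up=False, down=False, left=False, right=False):
    """One per-frame position update with the frame-edge checks (model only)."""
    new_x, new_y = x, y
    if up and y > SQUARE_SIZE:
        new_y = y - 1
    if down and y < V_PIXELS - 1 - SQUARE_SIZE:
        new_y = y + 1
    if left and x > SQUARE_SIZE:
        new_x = x - 1
    if right and x < H_PIXELS - 1 - SQUARE_SIZE:
        new_x = x + 1
    return new_x, new_y


# Hold "up" and "left" for 1000 frames starting from the center
x, y = 320, 240
for _ in range(1000):
    x, y = step(x, y, up=True, left=True)
print(x, y)
```

In this model the center stops at square_size from the top and left edges, and at 629 and 469 on the right and bottom, so the square stays inside the red frame.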

 

now the square will never go outside! let’s save the code (it’s also saved on GitHub as example_1.v), then synthesize and program:

 

wallhit

 

OK, what else can we change? Keeping the button pressed all the time to move the square is boring, so let’s make it move after we just press and release the key, without needing to keep it pressed.
we can do this by commenting out the code which clears the key flags:

/* if(ps2_data_reg_prev1 == 8'he0 && ps2_data_reg_prev == 8'hf0) begin // 0xE0 0xF0 sequence means key released
 if(ps2_data_reg == 8'h75) begin
 u_arr <= 0; //0x75 up key
 end
 else if(ps2_data_reg == 8'h6b) begin
 l_arr <= 0; //0x6B left key
 end
 else if(ps2_data_reg == 8'h72) begin
 d_arr <= 0; //0x72 down key
 end
 else if(ps2_data_reg == 8'h74) begin
 r_arr <= 0; //0x74 right key
 end
 end
 */

 

Now when you press a key just once, the square keeps moving in that direction until it hits the ‘wall’, then stops! This code is saved on GitHub as example_2.v.

Can we make it bounce? Sure we can, we just have to set the opposite key flag when the square hits the wall:

 

  if(c_row == 1 && c_col == 1) begin //once per video frame
    if(u_arr) begin
      if (sq_pos_y > square_size) begin
        sq_pos_y <= sq_pos_y - 1;
      end
      else begin  // change direction when hit wall
        u_arr <= 0;
        d_arr <= 1;
      end
   end;

  if(d_arr) begin
    if (sq_pos_y < (v_pixels - 1 - square_size)) begin
      sq_pos_y <= sq_pos_y + 1;
    end
    else begin
      d_arr <= 0;
      u_arr <= 1;
    end
 end;

 if(l_arr) begin
   if (sq_pos_x > square_size) begin
     sq_pos_x <= sq_pos_x - 1;
   end
   else begin
     l_arr <= 0;
     r_arr <= 1;
   end
 end;

 if(r_arr) begin
   if (sq_pos_x < (h_pixels - 1 - square_size)) begin
     sq_pos_x <= sq_pos_x + 1;
   end
   else begin
     r_arr <= 0;
     l_arr <= 1;
   end
 end;

end
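The bounce logic above can also be checked with a small model. It only covers the position update and the direction flags (no keyboard input), and `bounce_step` plus the dictionary keys are names invented for this sketch. As in the Verilog, every condition reads the values from before the update.

```python
H_PIXELS, V_PIXELS, SQUARE_SIZE = 640, 480, 10   # from the Verilog parameters


def bounce_step(state):
    """Advance one video frame; reads use the old state, writes go to a copy."""
    old, new = state, dict(state)
    if old["up"]:
        if old["y"] > SQUARE_SIZE:
            new["y"] = old["y"] - 1
        else:                                   # hit the top wall: reverse direction
            new["up"], new["down"] = False, True
    if old["down"]:
        if old["y"] < V_PIXELS - 1 - SQUARE_SIZE:
            new["y"] = old["y"] + 1
        else:                                   # hit the bottom wall
            new["down"], new["up"] = False, True
    if old["left"]:
        if old["x"] > SQUARE_SIZE:
            new["x"] = old["x"] - 1
        else:                                   # hit the left wall
            new["left"], new["right"] = False, True
    if old["right"]:
        if old["x"] < H_PIXELS - 1 - SQUARE_SIZE:
            new["x"] = old["x"] + 1
        else:                                   # hit the right wall
            new["right"], new["left"] = False, True
    return new


state = {"x": 320, "y": 12, "up": True, "down": False, "left": False, "right": False}
ys = []
for _ in range(2000):
    state = bounce_step(state)
    ys.append(state["y"])
print(min(ys), max(ys))
```

In this model the square spends one extra frame at the wall while the direction flags flip, and then moves back the other way.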

 

Let’s save and compile! What? We got error!

 example.blif:1750: fatal error: net `d_arr' has multiple drivers
 Makefile:11: recipe for target 'example.asc' failed
 make: *** [example.asc] Error 1

 

What does this mean? The d_arr register, where we store the key direction, has multiple drivers! Looking at the code we see that we assign d_arr in two different always blocks.
In an FPGA all processes run in parallel, so if we assign one signal in two different blocks there is no way to know which assignment takes effect when, and this is considered an error in the behavioral description.
Both always blocks are executed on the positive edge of vga_clk, so we can simply merge them by copying:

 ps2_clk_buf[1:0] <= {ps2_clk_buf[0], ps2_clk}; // shift old value left and get current value of ps2_clk
   if(ps2_clk_pos == 1) begin // on positive edge
     ps2_cntr <= ps2_cntr + 1;
   if(ps2_cntr == 10) begin // when we got 10 clocks save the PS2 data to ps2_data_reg, ps2_data_reg_prev and ps2_data_reg_prev1
     ps2_cntr <= 0; // so we have last 3 data values captured from PS2 keyboard
     ps2_data_reg[7] <= ps2_dat_r[0];
     ps2_data_reg[6] <= ps2_dat_r[1];
     ps2_data_reg[5] <= ps2_dat_r[2];
     ps2_data_reg[4] <= ps2_dat_r[3];
     ps2_data_reg[3] <= ps2_dat_r[4];
     ps2_data_reg[2] <= ps2_dat_r[5];
     ps2_data_reg[1] <= ps2_dat_r[6];
     ps2_data_reg[0] <= ps2_dat_r[7];
     ps2_data_reg_prev <= ps2_data_reg;
     ps2_data_reg_prev1 <= ps2_data_reg_prev;
   end
   ps2_dat_r <= {ps2_dat_r[9:0], ps2_data}; // data shift left
 end

 if(ps2_data_reg_prev == 8'he0) begin //0xE0 means key pressed
   if(ps2_data_reg == 8'h75) begin
     u_arr <= 1; //0x75 up key
   end
   else if(ps2_data_reg == 8'h6b) begin
     l_arr <= 1; //0x6B left key
   end
   else if(ps2_data_reg == 8'h72) begin
     d_arr <= 1; //0x72 down key
   end
   else if(ps2_data_reg == 8'h74) begin
     r_arr <= 1; //0x74 right key
   end
 end

 

this code goes after the video generation, and the first always block is deleted. On GitHub this code is saved as example_3.v

Now the code synthesizes and we can program the FPGA. The square bounces off the frame every time it hits it!

We will leave it up to you to hack further, e.g. to change the speed of the square!

 

Hello World! with Verilog on iCE40HX1K-EVB with open source tool IceStorm

iCE40HX1K-EVB-1

One of the workshops at TuxCon 2016 included using Open Source Hardware FPGA board iCE40HX1K-EVB and there we went through the development process with FPGA and Verilog.

For those of you who were unable to attend, we will now show you what you’ve missed. First, see the previous post about how to set up the FPGA FOSS IceStorm tools here.

Now that the tools are set, let’s learn some more about FPGA. This is a very brief introduction and it is far from comprehensive, but the Internet has tons of resources you can use to learn more.

We will go through only the most frequently asked questions from the workshop:

What is FPGA ?

FPGA stands for Field Programmable Gate Array. They are digital integrated circuits (ICs) that contain configurable (programmable) blocks of logic along with configurable interconnections between these blocks. Design engineers can configure, or program, such devices to perform a variety of tasks.

How many times can one FPGA be programmed?

Some FPGAs may only be programmed a single time (they are called OTP), while others may be reprogrammed over and over again. For development boards we need the latter, because when we develop we often make mistakes and need to be able to program the FPGA multiple times. The FPGAs which can be programmed many times usually have external non-volatile memory. It contains the configuration file, which is read at power-up into the local RAM inside the FPGA and defines the interconnections between the blocks inside the FPGA. So when you apply power to these FPGAs, they need a small amount of time to read their program before they start working.

When are FPGAs used in a design?

FPGAs allow many tasks to be performed in parallel at very high speed. They are also highly integrated (some FPGAs have millions of programmable blocks), so you can complete complex hardware designs in a very small space. The trade-off is that FPGAs are programmed differently from microcontrollers (as you will see later), so they require a little more studying to get used to.
If your application requires high speed and complex parallel tasks, you need an FPGA. Typical applications are digital signal processing, such as video and audio filtering, where FPGAs can outperform the fastest DSPs by a factor of 500. Other applications are developing new digital ICs, like processors or microcontrollers with new architectures and instructions. FPGAs are also used for physical layer communications, decoding and encoding high speed communication lines like HDMI, SATA and USB.
There is no sense in using an FPGA for slow processes which can be handled by a microcontroller, but FPGAs can be used to add fast peripherals to microcontrollers. For example, if you need a very fast SPI to capture some fast serial signal: most microcontrollers have SPIs which work up to a 20-30 MHz clock, while with an FPGA you can make an SPI which works at 100 MHz, 200 MHz or 300 MHz, buffers the data and then re-transmits it slowly to the microcontroller, which can then do something with it.
You can synthesize almost any digital circuit with an FPGA and make your own microprocessor with a custom number of registers and a custom instruction set. Most of the companies which design microprocessors / microcontrollers first test their ideas on FPGAs.

How are FPGAs programmed (configured)?

Back in 1984, when the first FPGAs were made, the design flows used for CPLDs were adopted: FPGAs were programmed by drawing schematics of digital circuits, and the CAD tool then synthesized the schematic into configuration files which you could load into the FPGA. This approach works well, but once FPGAs grew to thousands of logic cells and the schematics to many pages, the process became more and more error-prone as the schematic grew. (Just imagine drawing the internal schematic of a modern processor out of digital logic and then testing it.)
At the end of the 1980s a move toward HDLs (hardware description languages) was made. Visualizing, capturing, debugging, understanding, and maintaining a design at the gate level of abstraction had become increasingly difficult and inefficient when juggling thousands of gates.
The lowest level of abstraction for a digital HDL is switch level, which describes the circuit as a netlist of transistor switches.
A higher level of abstraction is the gate level, which describes the circuit as a netlist of primitive logic gates and functions.
The next level of HDL abstraction is the ability to support functional representations using Boolean equations.
The highest level of abstraction supported by traditional HDLs is known as behavioral, which describes the behavior of a circuit using abstract constructs like loops and processes, similar to a programming language.

Verilog and IceStorm

Verilog is one such behavioral HDL. Another one, very popular in Europe, is VHDL, but as IceStorm, the FOSS FPGA tool for iCE40, supports only Verilog, we will make all the next demos in Verilog 🙂.

Let’s have a look at the first Blink LED project we programmed on iCE40HX1K-EVB in the previous blog post. It’s available on GitHub.

The Makefile

This is the configuration file for the project, which tells IceStorm how to build it:

    PROJ = example

 

this is the project name; it could be any other name. IceStorm will look for the example.v source file, and the result at the end will be example.bin, which you can program to iCE40HX1K-EVB

    PIN_DEF = ice40hx1k-evb.pcf

 

this is an external file which assigns the signals we use in the project to the physical chip pin numbers. If we open it we will see:

    set_io CLK 15
    set_io BUT1 41
    set_io BUT2 42
    set_io LED1 40
    set_io LED2 51

 

which means the 100 MHz oscillator clock is connected to pin 15, button 1 to pin 41, LED1 to pin 40 and so on.

    DEVICE = hx1k

 

this tells IceStorm which device is used, in this case a device from the HX series with 1K logic blocks

    yosys -p 'synth_ice40 -top top -blif $@' $<

 

invokes yosys to synthesize the example.v Verilog sources. ‘top’ is the name of the top module; you can think of it as something like main() in C.

    arachne-pnr -d $(subst hx,,$(subst lp,,$(DEVICE))) -o $@ -p $^ -P vq100

 

after yosys has synthesized the sources, ‘arachne-pnr’ tries to place and route them physically inside the chip. You can imagine the logic cells as a matrix; this tool has to decide how to arrange them so as to minimize the distances between connected cells and physical pins, so that the design works at the maximum possible speed. Look at the -P vq100 switch: it tells arachne-pnr which package is used for the device, in our case the VQ100 chip package.

    icepack $< $@

 

packs the text file output generated by arachne-pnr into a .bin file, ready to be programmed into the FPGA’s external Flash memory

    icetime -d $(DEVICE) -mtr $@ $<

 

The icetime program is an iCE40 timing analysis tool. It reads designs in IceStorm ASCII format and writes timing netlists that can be used in external timing analysers. It also includes a simple topological timing analyser that can be used to create timing reports.

    sudo iceprogduino $<

 

a small program which uses OLIMEXINO-32U4 (Arduino Leonardo) with custom firmware as a programmer for the iCE40HX1K-EVB SPI Flash

example.v

    module top( //top module
       CLK,
       BUT1,
       BUT2,
       LED1,
       LED2
    );

 

this describes the ‘top’ module in the code which will be synthesized; it uses the physical signals defined in ice40hx1k-evb.pcf

then we define which of these signals are inputs and which are outputs:

    input CLK;    //input 100Mhz clock
    input BUT1;   //input signal from button 1
    input BUT2;   //input signal from button 2
    output LED1;  //output signal to LED1
    output LED2;  //output signal to LED2

 

with the keyword ‘reg’ we define registers, i.e. the analog of variables in a programming language, but here they have a default width of 1 bit; in registers we can store and read signals

    reg BUT1_r;           //register to keep button 1 state
    reg BUT2_r;           //register to keep button 2 state
    reg LED1_m0_r;        //LED1 value in mode = 0
    reg LED2_m0_r;        //LED2 value in mode = 0
    reg LED1_m1_r;        //LED1 value in mode = 1
    reg LED2_m1_r;        //LED2 value in mode = 1
    reg [14:0] cntr;      // 15 bit counter for LED blink timing
    reg [14:0] rst_cnt=0; // 15 bit counter for button debounce
    reg mode=1;           //mode set to 1 initially
    reg [11:0] clk_div;   // 12 bit counter

 

You can see that cntr and rst_cnt have [14:0] in front of them. This means they are 15 bit wide registers; clk_div is 12 bit wide.

With the keyword wire you define internal signals, in addition to those defined in the top module port list.

    wire clk_24KHz; //signal with approx 24KHz clock
    wire reset;     //used for button debounce

The keyword assign makes a connection between signals, so every time the right side signal changes, the same change occurs on the left side signal.

    assign reset = rst_cnt[14]; //reset signal is connected to bit 14 (the 15th bit) of rst_cnt
    assign LED1 = mode ? LED1_m1_r : LED1_m0_r; //multiplexer controlled
                      //by mode connects LED1_m1_r or LED1_m0_r to LED1
    assign LED2 = mode ? LED2_m1_r : LED2_m0_r; //multiplexer controlled
                      //by mode connects LED2_m1_r or LED2_m0_r to LED2
    assign clk_24KHz = clk_div[11];      //100MHz/4096 = 24414 Hz

 

In this case the 15th bit of the rst_cnt register (rst_cnt[14]) is connected to the signal reset, and the signal clk_24KHz is connected to the 12th bit of the clk_div register (clk_div[11]).
LED1 and LED2 are each connected via a multiplexer (made with the ? : operator), controlled by the signal mode, to two registers with the suffixes ‘m1’ and ‘m0’.
So when mode is 0, LED1 is connected to the LED1_m0_r register, and when mode is 1, to LED1_m1_r.
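To make the multiplexer idea more concrete, here is a minimal stand-alone sketch. It is not part of example.v: the module name mux2_sketch and its port names are made up for illustration. It shows the same 2-to-1 multiplexer written once with assign and the ? : operator, and once as a combinational always block; both outputs should behave identically.

```verilog
// Hypothetical stand-alone module, NOT part of example.v.
// All names here are assumptions chosen for illustration.
module mux2_sketch(
    input  wire sel,       // plays the role of 'mode'
    input  wire a,         // selected when sel = 1 (like LED1_m1_r)
    input  wire b,         // selected when sel = 0 (like LED1_m0_r)
    output wire y_assign,  // multiplexer built with assign and ? :
    output reg  y_always   // same multiplexer built with always @*
);
    // continuous assignment: y_assign follows a or b depending on sel
    assign y_assign = sel ? a : b;

    // equivalent combinational logic written as an if/else
    always @* begin
        if (sel)
            y_always = a;
        else
            y_always = b;
    end
endmodule
```

The ? : form is shorter, which is probably why it is used for LED1 and LED2 in example.v.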

An always block is executed every time something in its sensitivity list changes:

    always @ (posedge CLK) begin      //on each positive edge of 100MHz clock increment clk_div
       clk_div <= clk_div + 12'b1;
    end

 

In this case the block is executed on every positive edge of CLK, i.e. every time CLK changes from 0 to 1, and it adds 1 to clk_div.
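The same divider can be sketched as a small reusable module. This is only an illustration under a few assumptions: the module name clk_div_sketch, the WIDTH parameter and the explicit initial value are not in example.v. The output frequency is the input frequency divided by 2^WIDTH, so WIDTH = 12 gives the /4096 used above.

```verilog
// Hypothetical stand-alone divider, NOT part of example.v.
// Output frequency = input frequency / 2^WIDTH.
module clk_div_sketch #(
    parameter WIDTH = 12          // 12 bits -> divide by 4096
)(
    input  wire clk_in,           // e.g. the 100MHz board clock
    output wire clk_out           // top bit of the counter
);
    // explicit initial value (an assumption added here) so that
    // a simulator does not start the counter from an unknown state
    reg [WIDTH-1:0] div = 0;

    // count up on every rising edge, wrapping around at 2^WIDTH
    always @(posedge clk_in)
        div <= div + 1'b1;

    // the most significant bit toggles once per 2^WIDTH input cycles
    assign clk_out = div[WIDTH-1];
endmodule
```

Using a counter bit as a clock keeps a first example simple. Larger designs often use a clock-enable signal instead, but that is beyond this tutorial.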

next always block is a bit more complex:

    always @ (posedge clk_24KHz) begin //on each positive edge of 24414Hz clock
        BUT1_r <= BUT1;       //capture button 1 state to BUT1_r
        BUT2_r <= BUT2;       //capture button 2 state to BUT2_r
        cntr <= cntr + 15'd1; //increment cntr LED blink counter

        if(reset == 1'b0) begin //if bit 14 (the 15th bit) of rst_cnt is not set yet
            rst_cnt <= rst_cnt + 15'd1; //increment the counter rst_cnt
        end

        if(BUT1_r == 1'b0 && BUT2_r == 1'b0 && reset == 1'b1) begin
            //if bit 14 of rst_cnt is set and both buttons are pressed
            mode <= mode ^ 1'b1; //toggle the mode
            rst_cnt <= 15'd0; //clear debounce rst_cnt
        end

        LED1_m0_r <= ~BUT1_r; //copy inv state of button 1 to LED1_m0_r
        LED2_m0_r <= ~BUT2_r; //copy inv state of button 2 to LED2_m0_r

        if(cntr == 15'd12207) begin //when 0.5s pass
            LED1_m1_r <= 1'b0; //reset LED1_m1_r
            LED2_m1_r <= 1'b1; //set LED2_m1_r
        end

        if(cntr > 15'd24414) begin //when 1.0s pass
            cntr <= 15'd0; //clear cntr
            LED1_m1_r <= 1'b1; //set LED1_m1_r
            LED2_m1_r <= 1'b0; //reset LED2_m1_r
        end
    end

What happens here? On every positive edge of clk_24KHz:
BUT1_r and BUT2_r are loaded with the current state of the buttons,
cntr is incremented by 1; this is our LED blink frequency counter.
clk_24KHz is not exactly 24 kHz but 100 000 000 Hz / 4096 = 24414 Hz, or 24.414 kHz :)

When cntr reaches the value 12207, i.e. half a second has passed, LED1_m1_r is loaded with 0 and LED2_m1_r is loaded with 1.
When cntr passes the value 24414, i.e. one second has passed, cntr is cleared, LED1_m1_r is loaded with 1 and LED2_m1_r is loaded with 0,
i.e. if mode is 1, LED1 and LED2 will blink alternately, switching every half second.

When mode is 0, LED1_m0_r and LED2_m0_r follow the button states, i.e. in this mode LED1 is on while you press button 1 and goes off when you release it.
The same applies to LED2 and button 2.
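If you want to double check the timing numbers, a simulator can do the arithmetic for you. The sketch below is not part of the project: the module name and the localparam names are made up for illustration, and it assumes the 100MHz clock and the /4096 divider described above. It uses integer division, so the fractional part of 24414.0625 is dropped.

```verilog
// Hypothetical helper, NOT part of example.v: prints the blink constants.
// Run it in any Verilog simulator; it contains no synthesizable logic.
module blink_consts_sketch;
    localparam integer CLK_HZ  = 100_000_000;     // board oscillator
    localparam integer DIVIDER = 4096;            // 2^12, from clk_div[11]
    localparam integer TICK_HZ = CLK_HZ / DIVIDER; // ticks per second (integer division)
    localparam integer HALF_S  = TICK_HZ / 2;      // ticks in half a second

    initial begin
        $display("ticks per second  : %0d", TICK_HZ);
        $display("ticks per 0.5 s   : %0d", HALF_S);
        // a 15 bit counter can hold values up to 2^15 - 1
        $display("15 bit counter max: %0d", (1 << 15) - 1);
    end
endmodule
```

This should print 24414, 12207 and 32767, which matches the constants compared against cntr and shows why 15 bits are enough for a one second count.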

Now let's pay some more attention to what this code describes:

   if(reset == 1'b0) begin //if bit 14 (the 15th bit) of rst_cnt is not set yet
     rst_cnt <= rst_cnt + 15'd1; //increment the counter rst_cnt
   end

   if(BUT1_r == 1'b0 && BUT2_r == 1'b0 && reset == 1'b1) begin
           //if bit 14 of rst_cnt is set and both buttons are pressed
     mode <= mode ^ 1'b1;  //toggle the mode
     rst_cnt <= 15'd0;    //clear debounce rst_cnt
   end

reset is a signal connected to the 15th bit of rst_cnt (rst_cnt[14]), so until this bit is set, rst_cnt is incremented on every positive edge of clk_24KHz.
Once reset is 1, pressing BUT1 and BUT2 together toggles the mode and sets rst_cnt back to 0, which ensures some debounce time before the mode can be toggled again.
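How long is this debounce time? The small simulation-only sketch below counts the ticks until reset goes high. It is an illustration, not part of the project: the testbench name lockout_tb, the ticks counter and the simulated clock period are assumptions, and only the rst_cnt logic is copied from example.v.

```verilog
`timescale 1ns/1ps
// Hypothetical testbench, NOT part of example.v.
// It counts how many clock ticks pass before rst_cnt[14] is set.
module lockout_tb;
    reg        clk     = 0;      // stands in for clk_24KHz
    reg [14:0] rst_cnt = 0;      // same width and start value as in example.v
    wire       reset   = rst_cnt[14];
    integer    ticks   = 0;      // helper counter, simulation only

    always #1 clk = ~clk;        // free running simulation clock

    always @(posedge clk) begin
        if (reset == 1'b0) begin
            rst_cnt <= rst_cnt + 15'd1; // same increment as in example.v
            ticks   <= ticks + 1;
        end
    end

    initial begin
        wait (reset == 1'b1);    // block until bit 14 is set
        @(negedge clk);          // let all updates of this edge settle
        $display("reset went high after %0d ticks", ticks);
        $finish;
    end
endmodule
```

Bit 14 is first set when the counter reaches 2^14 = 16384, so the simulation should report 16384 ticks. At 24414 ticks per second this is roughly 0.67 seconds during which a new button press cannot toggle the mode.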

You can download the project, then build it and program the board with these two commands:

    make

    make prog

You will first see LED1 and LED2 blink, since the default mode is 1. To toggle the mode, press BUT1 and BUT2 together and release them quickly.

LED1 and LED2 will switch off. In this mode, pressing BUT1 switches on LED1 and pressing BUT2 switches on LED2. If you press both buttons together, the mode changes back to 1 and LED1 and LED2 start blinking again.

Your first program is done!

Now let's see what will happen if we change line 48 from

   mode <= mode ^ 1'b1; //toggle the mode

to:

   model <= mode ^ 1'b1; //toggle the mode

i.e. we made a mistake and wrote model instead of mode. What error message do you think you will get when synthesis is done?

You can try! Whaaaat? Everything completes correctly and you get your example.bin ready for programming. What happens when we run it? Right! LED1 and LED2 blink and you can’t change the mode by pressing BUT1 and BUT2 together anymore!

OMG, how did this happen? Welcome to the wonderful world of Verilog :) If you use a signal that was never declared, it is silently created for you and yosys issues only a WARNING, not an error. In this case the warning is at the very beginning of the 1233 lines of messages printed while the source is synthesized:

    Parsing Verilog input from `example.v' to AST representation.
    Generating RTLIL representation for module `\top'.
    Warning: Identifier `\model' is implicitly declared at example.v:48.
    Successfully finished Verilog frontend.

 

This feature may make you bang your head against the wall searching for errors. It can’t happen in VHDL, where everything has to be strictly defined before it is used.

VHDL vs Verilog is like the old C vs Pascal choice. In C you can do a lot of things to shoot yourself in the foot and the compiler will not stop you.

 

In the next FPGA blog post we will go deeper and show you how to generate VGA video signals with iCE40HX1K-EVB + iCE40-IO boards and how to move an object on the screen with the arrow keys of a PS/2 keyboard.

And we will not stop there: we are preparing more tutorials with iCE40HX1K-EVB + iCE40-IO, the video games Snake and Flappy Bird. Later we will teach you how to build a Digital Storage Oscilloscope with iCE40HX1K-EVB + iCE40-IO + iCE40-ADC, how to make a Digital Logic Analyzer with iCE40HX1K-EVB + iCE40-DIO for sniffing protocols from devices operating at levels from 1.65 to 5.5V, and how to make a DDS generator of signals of any form using iCE40HX1K-EVB + iCE40-DAC.

 

EDIT: As I wrote, we are learning this stuff too! Implicit declarations can be disabled by adding this at the top of your code:

    `default_nettype none

 

I just tried this, and yosys stops with an error when I mistype ‘mode’ as ‘model’:

    Parsing Verilog input from `example.v' to AST representation.
    Generating RTLIL representation for module `\top'.
    ERROR: Identifier `\model' is implicitly declared at example.v:50 and `default_nettype is set to none.
    Makefile:8: recipe for target 'example.blif' failed
    make: *** [example.blif] Error 1
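A common way to use this directive is sketched below. This is an illustration, not the project source: the module nettype_sketch and its ports are made up. The directive stays in effect for everything compiled after it, so many people switch it back to wire at the end of the file to avoid surprising other sources. Some tools are also stricter than yosys once the default is none and want an explicit wire on the ports, which is why the ports are written that way here.

```verilog
// Hypothetical stand-alone file, NOT part of example.v.
`default_nettype none   // from here on, undeclared identifiers are errors

module nettype_sketch(
    input  wire clk,     // explicit 'wire': with 'none' some tools require it
    input  wire toggle,  // made-up control input for this sketch
    output reg  mode
);
    initial mode = 1'b1; // start in mode 1, as example.v does

    always @(posedge clk) begin
        if (toggle)
            mode <= mode ^ 1'b1; // a typo such as 'model' would now stop the build
    end
endmodule

`default_nettype wire   // restore the default for files compiled after this one
```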
