PandAxICT5

2 minutes for a pitch at H2020 Info Day http://panda.dei.polimi.it/wp-content/uploads/ICT05-PandA.pdf https://ec.europa.eu/digital-single-market/en/news/h2020-info-day-factories-future-12-ict-5-and-ict-31-ict-innovation-manufacturing-smes-i4ms #ICT5 #DSMeu #UE #PandA4Design

PandA 0.9.3 released

New features introduced:
– general improvement in the performance of generated circuits
– added full support for the GCC 4.9 family, which is now the default
– improved retrieval of GCC alias analysis information
– added a first version of the VHDL backend
– added support for CycloneV
– added support for Artix7
– extended support for the Virtex7 board family
– added option --top-rtldesign-name, which controls which function is synthesized by the RTL backend
– it is now possible to write the testbench in C instead of using the XML file
– added a first experimental backend for Yosys
– added examples/crc_yosys, which tests the Yosys backend and C-based testbenches
– improved Verilog testbench generation: it is now fully compliant with cycle-based simulators (e.g., Verilator)
– added option --backend-script-extensions to pass further constraints to the RTL synthesis (e.g., pin assignment)
– added examples/VGA showing how to integrate existing HDL-based IPs in a real FPGA design
– added scripts and results for CHStone synthesis of Lattice based designs
– improved support of complex numbers
– single precision soft-float functions redesigned: now --soft-float is the default and --flopoco becomes optional
– single precision floating point division implemented exploiting the Goldschmidt algorithm
– improved synthesis of libm functions
– improved libm regression test
– improved architectural timing model
– improved graphviz representation of FSMs: timing information has been added
– added option --post-rescheduling to further improve the resource usage
– parameter registering is now performed and can be controlled with option --registered-inputs
– added a full implementation of Bit Value analysis, coupled with the Value Range analysis performed by GCC
– added option --experimental-setup to control bambu defaults:
* BAMBU-PERFORMANCE-MP – multi-port performance oriented setup
* BAMBU-PERFORMANCE – single port performance oriented setup
* BAMBU-AREA-MP – multi-port area oriented setup
* BAMBU-AREA – single-port area oriented setup
* BAMBU – no specific optimizations enabled
– improved code speculation
– improved memory localization
– added option --do-not-expose-globals, which makes localization of globals possible, as is similarly done by some commercial tools
– added support for high latency memories and for distributed memories: zero-, one- and two-delay memories are supported
– added option --aligned-access to drive the memory allocation towards simpler block RAM models: it can be used under some restricted assumptions (e.g., no vectorization and no structs used)
– ported the GCC algorithm which rewrites a division by a constant into adds and shifts
– added option --hls-div that maps integer divisions and modulus on a C based implementation of the Newton-Raphson algorithm
– improved technology libraries management:
* technology libraries and constraints are now managed independently
* multiple technology libraries can be provided to the tool at the same time
– improved and parallelized PandA test regression infrastructure
– added support for CentOS 7, Fedora 21, Ubuntu 14.04 and Ubuntu 14.10 distributions
– complete refactoring of output messages
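
As an illustration of the division-by-constant rewrite listed above, here is a minimal sketch in C for the constant 10. It is not taken from the PandA sources: the function names are hypothetical, the first variant uses the well-known multiply-and-shift form (the constant multiplication can itself be lowered to shifts and adds), and the second is a pure shift-and-add variant of the kind found in the bit-manipulation literature.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: unsigned division by 10 without a divider.
 * 0xCCCCCCCD is ceil(2^35 / 10); multiplying by it and shifting right
 * by 35 yields floor(x / 10) for every 32-bit unsigned x. */
uint32_t div10(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}

/* Hypothetical example: the same division using only shifts and adds.
 * q first approximates n * 0.8 from below, is scaled down by 8, and the
 * final comparison corrects the quotient when the remainder exceeds 9. */
uint32_t div10_shift_add(uint32_t n)
{
    uint32_t q, r;
    q = (n >> 1) + (n >> 2);
    q = q + (q >> 4);
    q = q + (q >> 8);
    q = q + (q >> 16);
    q = q >> 3;
    r = n - (((q << 2) + q) << 1);   /* r = n - q * 10 */
    return q + (r > 9);
}

/* Compare both variants against the C division operator on a sample
 * of inputs spread over the whole 32-bit range. */
void check_div10(void)
{
    uint64_t x;
    for (x = 0; x <= 0xFFFFFFFFull; x += 99991u) {
        assert(div10((uint32_t)x) == (uint32_t)x / 10u);
        assert(div10_shift_add((uint32_t)x) == (uint32_t)x / 10u);
    }
}
```

Calling check_div10() exercises both variants; in a hardware context the appeal of such rewrites is that shifts by constants are free wiring and adders are far cheaper than a general divider.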

Problems fixed:
– fixed problem related to Bison 2.7
– fixed reinstallation of PandA in a different folder
– fixed installation problems on systems where boost and gcc are not installed in default locations
– removed some implicit conversions from generated Verilog circuits

For any information or bug report, please write to panda-info@elet.polimi.it or to

PandA 0.9.2 released

PandA 0.9.2 new features introduced:
– added initial support for GCC 4.9.0,
– stable support for GCC versions: v4.5, v4.6, v4.7 (default) and v4.8,
– added experimental support for the Verilator simulator,
– new dataflow dependency analysis for LOADs and STOREs; we now use GCC alias analysis to see if a LOAD and STORE pair or a STORE and STORE pair may conflict,
– added a frontend step that optimizes PHI nodes,
– added a frontend step that conditionally performs if-conversion,
– added a frontend step that performs simple code motions,
– added a frontend step that converts if chains into a single multi-if control construct,
– added a frontend step that simplifies short-circuit-based control constructs,
– added a proxy-based approach to the LOADs/STOREs of statically resolved pointers,
– improved EBR inference for Lattice based devices,
– now, memory models are different for Lattice, Altera, Virtex5 and Virtex6-7 based devices,
– updated FloPoCo to a more recent version,
– now, register allocation maps storage values on registers without write enable when possible,
– added support for CentOS/Scientific Linux distributions,
– added support for the ArchLinux distribution,
– added support for the Ubuntu 13.10 distribution,
– now, testbenches accept a user-defined error for float-based computations; the error is specified in ULPs, where a ULP (Unit in the Last Place) is the spacing between adjacent floating-point numbers,
– improved architectural timing model,
– added a very simple symbolic estimator of the number of cycles taken by a function; it mainly covers functions without loops and without unbounded operations,
– general refactoring of automatic HLS testbench generation,
– added support for the libm functions lceil and lceilf,
– added the skip-pipe-parameter option to bambu; it is used to select a faster pipelined unit (Xilinx devices have a default of 1, while Lattice and Altera devices have a default of 0),
– improved memory allocation when byte-enabled write memories are not needed,
– added support for variable-length arrays,
– added support for the memalign builtin,
– added the EXT_PIPELINED_BRAM memory allocation policy; with this option, bambu assumes that a block RAM memory is allocated outside the core (LOAD with II=1 and latency=2 and STORE with latency=1),
– added 2D matrix multiplication examples for integers and single precision floating point numbers,
– added some synthesis scripts checking bambu quality of results w.r.t. 72 single precision libm functions (e.g., sqrtf, sinf, expf, tanhf, etc.),
– added spider tool to automatically generate latex tables from bambu synthesis results,
– moved all the generated dot files into the directory HLS_output/dot/. Such files (scheduling, FSM, CFG, DFG, etc.) are now generated when the --print-dot option is passed,
– VIVADO is now the default backend flow for Zynq based devices.
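
To make the ULP-based error tolerance mentioned above concrete, the following is a minimal sketch in C of how a distance in ULPs between two single precision values can be measured. It is not the code used by the PandA testbenches: the helper names are hypothetical, IEEE 754 binary32 floats are assumed, and NaN inputs are not handled.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Map a float's bit pattern onto a monotonically ordered integer line.
 * Assumes IEEE 754 binary32; +0.0 and -0.0 both map to 0. */
static int64_t ordered_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return (u & 0x80000000u) ? -(int64_t)(u & 0x7FFFFFFFu) : (int64_t)u;
}

/* Number of representable floats between a and b (neither may be NaN). */
uint64_t ulp_distance(float a, float b)
{
    int64_t d = ordered_bits(a) - ordered_bits(b);
    return (uint64_t)(d < 0 ? -d : d);
}

/* Hypothetical acceptance check: is result within max_ulp ULPs of expected? */
int within_ulps(float result, float expected, uint64_t max_ulp)
{
    return ulp_distance(result, expected) <= max_ulp;
}

/* 1.0f + FLT_EPSILON is the float immediately above 1.0f, so it lies
 * exactly one ULP away; two epsilons lie two ULPs away. */
void check_ulps(void)
{
    assert(ulp_distance(1.0f, 1.0f + FLT_EPSILON) == 1);
    assert(ulp_distance(0.0f, -0.0f) == 0);
    assert(within_ulps(1.0f + 2 * FLT_EPSILON, 1.0f, 2));
    assert(!within_ulps(1.0f + 2 * FLT_EPSILON, 1.0f, 1));
}
```

A tolerance expressed this way scales with the magnitude of the values being compared, which is why it is often preferred over a fixed absolute error for floating point results.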

Problems fixed:
– fixed all the Bison related compilation problems,
– fixed some problems with testbench generation of 2D arrays,
– fixed configuration scripts for manually installed Boost libraries; now, we need at least Boost 1.48.0,
– fixed some problems with C pretty-printing of the internal IR,
– fixed some ISE/Vivado synthesis problems when SystemVerilog-based models are used,
– fixed some problems with --soft-float based synthesis,
– fixed RTL-backend configuration scripts looking for tools (e.g., ISE, Vivado, Quartus and Diamond) already installed,
– fixed some problems with real-to-int and int-to-real conversions, added some explicit tests to the panda regressions.

For any information or bug report, please write to panda-info@elet.polimi.it

PandA 0.9.1 released

PandA 0.9.1 new features introduced:
– complete support of CHStone benchmarks synthesis and verification (http://www.ertl.jp/chstone/),
– better support of multi-ported memories,
– local memory exploitation,
– read-only-memory exploitation,
– support of multi-bus for parallel memory accesses,
– support of unaligned memory accesses,
– better support of pipelined resources,
– improved module binding algorithms (e.g., slack-based module binding),
– support of resource constraints through user xml file,
– support of libc primitives: memcpy, memcmp, memset and memmove,
– better support of printf primitive for RTL debugging purposes,
– support of dynamic memory allocation,
– synthesis of libm builtin primitives such as sin, cos, acosh, etc,
– better integration with FloPoCo library (http://flopoco.gforge.inria.fr/),
– soft-float based HW synthesis,
– support of Vivado Xilinx backend,
– support of Diamond Lattice backend,
– support of XSIM Xilinx simulator,
– synthesis and testbench generation of WISHBONE B4 Compliant Accelerators (see http://cdn.opencores.org/downloads/wbspec_b4.pdf for details on the WISHBONE specification),
– synthesis of AXI4LITE Compliant Accelerators (experimental),
– inclusion of GCC regression tests to test HLS synthesis (tested HLS synthesis and RTL simulation),
– inclusion of libm regression tests to test HLS synthesis of libm (tested HLS synthesis and RTL simulation),
– support of multiple versions of GCC compiler: v4.5, v4.6 and v4.7,
– support of GCC vectorizing capability (experimental).
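
For readers unfamiliar with the kind of C input these features target, here is a small hypothetical kernel that combines a constant lookup table (a natural candidate for a read-only memory), a function-local buffer (a candidate for a local memory) and the libc primitives memset and memcpy. The function and table names are invented for illustration, and whether a given tool version actually maps them this way depends on its analyses.

```c
#include <assert.h>
#include <string.h>

/* Constant table: number of set bits in each 4-bit value. */
static const unsigned char popcount4[16] = {
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
};

/* Hypothetical kernel: counts the set bits in the first n bytes of src.
 * At most 16 bytes are processed; larger n is clamped. */
unsigned count_bits(const unsigned char *src, unsigned n)
{
    unsigned char local[16];            /* function-local buffer */
    unsigned i, total = 0;

    if (n > sizeof local)
        n = sizeof local;
    memset(local, 0, sizeof local);     /* clear the whole buffer */
    memcpy(local, src, n);              /* copy the requested bytes */

    /* Fixed trip count: unused entries are zero and add nothing. */
    for (i = 0; i < sizeof local; i++)
        total += popcount4[local[i] & 0x0F] + popcount4[local[i] >> 4];
    return total;
}

/* 0xFF has 8 set bits, 0x01 and 0x80 have one each. */
void check_count_bits(void)
{
    const unsigned char data[3] = { 0xFF, 0x01, 0x80 };
    assert(count_bits(data, 3) == 10u);
    assert(count_bits(data, 0) == 0u);
}
```

The loop has a fixed trip count and all arrays have sizes known at compile time, which keeps the example within the subset of C that high-level synthesis tools generally handle well.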

For any information or bug report, please write to panda-info@elet.polimi.it