Friday, December 21, 2012

Sobel Filter Application on the Xilinx Zynq Zedboard

Here are the steps for getting the Sobel filter application running on the Zynq Zedboard, using a webcam as the input data stream and showing the output frames on an HDMI display. The Sobel filter is implemented both as a hardware accelerator on the FPGA and as an OpenCV software implementation on the ARM processor.



Download Link:
Git Link:
UBoot File used:

Steps to get the application running on the board:

1. Connect power, an HDMI display, and a USB hub with keyboard, mouse and webcam (I used a Logitech webcam) to the Zedboard.

2. Format the SD Card using the instructions as given in the link below:
http://wiki.analog.com/resources/tools-software/linux-drivers/platforms/zynq#enable_xf86-video-modesetting_xorg_driver
The first partition needs to hold the boot.bin, zImage, and dts files. These can be found in the SD_card folder in the zip file. The second partition needs to hold the Linaro file system (also found at the above link).

3. Set mode on the Zedboard to SD card boot mode.

4. Power on and boot into Linux.

5. Install OpenCV using the command "sudo apt-get install libopencv-dev" (the sudo password is linaro).

6. Connect an Ethernet cable and copy the test_app folder from the zip file (cf_adv7511_zed\workspace\test_app) to the file system.

7. In the copied test_app folder, run "cmake ."

8. Run "make"

9. Run "sudo ./camera"

10. The application displays the original webcam stream. Press "h" to enable hardware Sobel filtering, "s" to enable software Sobel filtering, and "o" to return to the original stream. Press Escape to exit at any time.
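For readers who have not met the Sobel filter before, here is a minimal pure-Python sketch of what it computes on a grayscale frame. It is not the code from test_app (which uses OpenCV and the FPGA accelerator). The function names, the |Gx| + |Gy| magnitude approximation, and the zeroed border are my own assumptions for illustration. The small select_mode helper only mirrors the key bindings from step 10.

```python
# Illustrative 3x3 Sobel edge filter in plain Python.
# Assumption: this is a sketch, not the test_app implementation.

GX = [[-1, 0, 1],
      [-2, 0, 2],
      [-1, 0, 1]]
GY = [[-1, -2, -1],
      [0, 0, 0],
      [1, 2, 1]]


def sobel(image):
    """Return |Gx| + |Gy| for a grayscale image given as a list of rows.

    Border pixels are left at 0 and results are clamped to 255
    (both are assumptions of this sketch).
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += GX[dy][dx] * pixel
                    gy += GY[dy][dx] * pixel
            out[y][x] = min(255, abs(gx) + abs(gy))
    return out


def select_mode(key, current):
    """Map a key press to a display mode; unknown keys keep the current mode."""
    modes = {"o": "original", "s": "software", "h": "hardware"}
    return modes.get(key, current)
```

A frame with a sharp vertical edge gives a strong response next to the edge, while a flat frame gives zero everywhere.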

Note: This design is not fully optimized, as there is a significant communication bottleneck. It has two causes: the video stream buffer is copied to the DMA region and back, and the DDR memory has high access latency. For the first problem, I am looking into upgrading the kernel to version 3.8 to use the DMA buffer sharing options. For the second problem, I'm working on using the Accelerator Coherency Port, as it provides a low-latency path directly from the cache to the accelerator.
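To get a feel for the copy part of this bottleneck, here is a rough back-of-the-envelope sketch. Every number in it (frame size, bytes per pixel, frame rate, copy bandwidth) is a hypothetical placeholder, not a measurement from the Zedboard, and the function name is mine. It only assumes each frame is copied once into the DMA region and once back out.

```python
def copy_overhead(width, height, bytes_per_pixel, fps, copy_bandwidth_bytes_per_s):
    """Estimate the cost of copying each frame into the DMA region and back.

    All inputs are assumptions supplied by the caller; nothing here is measured.
    """
    frame_bytes = width * height * bytes_per_pixel
    bytes_copied_per_frame = 2 * frame_bytes  # one copy in, one copy out
    copy_time_per_frame = bytes_copied_per_frame / copy_bandwidth_bytes_per_s
    frame_period = 1.0 / fps
    return {
        "frame_bytes": frame_bytes,
        "bytes_copied_per_second": bytes_copied_per_frame * fps,
        "copy_time_per_frame_s": copy_time_per_frame,
        "fraction_of_frame_period": copy_time_per_frame / frame_period,
    }


# Hypothetical inputs: 640x480 frames, 4 bytes per pixel, 30 fps, 200 MB/s copies.
estimate = copy_overhead(640, 480, 4, 30, 200e6)
```

Plugging in measured values for the copy bandwidth would show how much of each frame period goes to copying alone, which is the share that buffer sharing would aim to remove.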

An EDK project with a bare-metal application is also shared, to test the hardware without the Linux kernel.

This reference design was made available by the PARSE research group at TU/e.

References:
http://wiki.analog.com/resources/tools-software/linux-drivers/platforms/zynq#enable_xf86-video-modesetting_xorg_driver
http://wiki.xilinx.com/zynq-base-trd-14-3
http://ez.analog.com/message/70323#70323

Tuesday, November 13, 2012

Approximate Computing 2

A reading list for Approximate Computing / Soft Computing / Imprecise Computing / Stochastic Computation

A Numerical Optimization-based Methodology for Application Robustification: Transforming Applications for Error Tolerance

Joseph Sloan, John Sartori, and Rakesh Kumar. "Exploiting Application-Level Error Tolerance in Software Design for Stochastic Processors." In the 49th Design and Automation Conference. DAC, San Francisco, June 2012. (PDF). (invited)

Best-effort semantic document search on GPUs

Sunday, November 4, 2012

Database of integer sequences

http://oeis.org/

Approximate Computing

A reading list for Approximate Computing / Soft Computing / Imprecise Computing / Stochastic Computation

Joseph Bates: http://web.media.mit.edu/~bates/Welcome_files/BatesPosterOct2010.pdf
http://web.media.mit.edu/~bates/Summary_files/BatesTalk.pdf

http://en.wikipedia.org/wiki/Bate%27s_chip

Doug Burger - Microsoft
http://research.ihost.com/weed2012/pdfs/paper%20G.pdf
http://sampa.cs.washington.edu/public/uploads/b/bc/Npu-dasi12.pdf
http://www.cs.utexas.edu/~hadi/doc/paper/2012-asplos-truffle.pdf
http://homes.cs.washington.edu/~asampson/media/papers/enerj-pldi2011.pdf

CMU
http://www.cs.cmu.edu/~gpekhime/Projects/15740/paper.pdf

IBM:
http://www.research.cornell.edu/KIC/events/Computing2008/Nair_Cornell_08.pdf


DARPA
http://csl.stanford.edu/~christos/publications/2012.isat.ACSWTP.slides.pdf

University of Illinois at Urbana-Champaign - Naresh R. Shanbhag
http://passat.crhc.illinois.edu/rakeshk/dac_10_invited_cam.pdf
ANT
http://www.eurasip.org/Proceedings/Eusipco/Eusipco2000/SESSIONS/THUPM/SS3/CR1825.PDF
Stochastic Computing: Embracing Errors in Architecture and Design of Processors and Applications

Energy Aware Probabilistic Multiplier
Energy-aware probabilistic multiplier: design and analysis, M. S. K. Lau et al., Proceedings of the 2009 International Conference on Compilers, Architecture, and Synthesis for Embedded Systems

Power Benefits of Imprecise Computer Arithmetic

IMPACT: IMPrecise adders for low-power approximate computing 

Interesting summary from the slides linked above (http://csl.stanford.edu/~christos/publications/2012.isat.ACSWTP.slides.pdf):
1. Exact output with approximate HW
2. Approximate output with deterministic HW (unsound SW transformations)
3. Approximate output with approximate HW
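As a concrete picture of category 2, here is a tiny sketch using loop perforation, that is, skipping loop iterations on ordinary deterministic hardware. Loop perforation is my own choice of example and is not named in the slides. The function names and the stride parameter are hypothetical.

```python
def mean_exact(values):
    """Exact mean: visits every element."""
    return sum(values) / len(values)


def mean_perforated(values, stride=2):
    """Approximate mean: visits only every `stride`-th element.

    The hardware is still deterministic; the approximation comes purely
    from the (unsound) software transformation of skipping iterations.
    """
    sampled = values[::stride]
    return sum(sampled) / len(sampled)


data = list(range(100))
error = abs(mean_exact(data) - mean_perforated(data))
```

The perforated version does roughly half the work with the default stride, and the price is a result that is close to, but not guaranteed to equal, the exact one.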

An interesting key point:
the statistical nature of the application level has to match the statistical nature of the underlying device!


Survey of Stochastic Computing
Book: Stochastic Computing
http://www.nowpublishers.com/product.aspx?product=EDA&doi=1000000021&section=xstart

Motivation for Neuromorphic computing systems in stochastic computing


Monday, October 29, 2012

Interesting Papers from DASIP Conference on Design and Architectures for Signal and Image Processing

Implementing Large-Kernel 2-D Filters using Impulse CoDeveloper
Carlos Colodro, Javier Toledo, Jose Javier Martinez, Javier Garrigos and Jose Manuel Ferrandez
(Universidad Politecnica de Cartagena - UPCT)


Approach to the Out-of-Order Arrival of Frames in Video Streaming Applications on Clustered MPSoC
Daniela Genius and Khouloud Zine El Abidine (LIP6)


A Nature-inspired Adaptive Floating-point Co-processing System
Carlo Sau, Danilo Pani, Francesca Palumbo, and Luigi Raffo (University of Cagliari)


An Experimental Toolchain Based on High-level Dataflow Models of Computation for Heterogeneous MPSoC
Julien Heulot, Karol Desnos, Jean-Francois Nezan, Maxime Pelcat, Mickael Raulet
(INSA, IETR, UMR 6164, UEB), Hervé Yviquel (IRISA, University of Rennes 1)
Jean-Christophe Le Lann and Pierre-Laurent Lagalaye (Modaë Technologies)


Foreground Detection and Image Segmentation in a Flexible ASVP Platform for FPGAs
Roman Bartosinski, Martin Danek, Jaroslav Sykora, Lukas Kohout (UTIA AV CR,v.v.i.),
and Petr Honzik (CIP plus s.r.o.)


High-Speed Camera with Embedded FPGA Processing
Uros Stevanovic, Michele Caselle, Suren Chilingaryan, Armin Herth, Andreas Kopmann,
Matthias Vogelgesang, Matthias Balzer and Marc Weber (KIT, IPE)


MEMSCOPT: A Source-to-Source Compiler for Dynamic Code Analysis and Loop Transformations
Grigoris Dimitroulakos, Christakis Lezos and Konstantinos Masselos (University of Peloponnese)


Parallel Video-Based Traffic Sign Recognition on the Intel SCC Many-core Platform
Jan Micha Borrmann, Alexander Viehl (Forschungszentrum Informatik Karlsruhe),
Oliver Bringmann and Wolfgang Rosenstiel (Universität Tübingen)


HLS-based Fast Design Space Exploration of ad hoc Hardware Accelerators: a Key Tool for MPSoC Synthesis on FPGA
Youenn Corre, Hoang Van-Trinh, Jean-Philippe Diguet, Dominique Heller and Loic Lagadec
(Lab-STICC, University of South-Brittany)


Architectural Decomposition of Video Decoders for Many Core Architectures
Henryk Richter and Volker Kühn (University of Rostock), and Benno Stabernack (Fraunhofer)


Analysis Techniques for Static Dataflow Models with Access Patterns
Kaushik Ravindran, Arkadeb Ghosal, Rhishikesh Limaye, Guoqiang Wang, Guang Yang,
and Hugo Andrade (National Instruments Corporation)


Design Space Exploration Strategies for FPGA Implementation of Signal Processing Systems using CAL Dataflow Program
Ab Al-Hadi Ab Rahman, Richard Thavot, Simone Casale Brunet, Endri Bezati and Marco Mattavelli
(EPFL)




Summary of AXI4

http://www.doulos.com/knowhow/arm/Migrating_from_AHB_to_AXI/

Books on Compilers

Optimizing Compilers for Modern Architectures: A Dependence-based Approach [Hardcover] Randy Allen (Author), Ken Kennedy (Author)

Wednesday, September 5, 2012

Intel - Enabling Consistent Platform-Level Services for Tightly Coupled Accelerators

Enabling Consistent Platform-Level Services for Tightly Coupled Accelerators

WORKSHOP: 1st International Workshop on Domain-Specific Multicore Computing

http://iccad.com/2012_event_details?id=149-2-W

List of FPL 2012 papers

Convey Vector Personalities – FPGA Acceleration with an OpenMP-Like Programming Effort?
Björn Meyer, Jörn Schumacher, Christian Plessl and Jens Förstner

PolyBlaze: From One to Many. Bringing the Microblaze Into the Multicore Era with Linux SMP Support
Eric Matthews, Lesley Shannon and Alexandra Fedorova

Automating the Design of MLUT MPSoPC FPGAs in the Cloud
David Andrews, Miaoqing Huang, Azad Fakhari, Eugene Cartwright, Sen Ma, Christina Smith and Jason Agron

Reconfigurable Multi-Processor Architecture For Streaming Applications
Leyla S. Ghazanfari, Roberto Airoldi, Jari Nurmi and Tapani Ahonen

High Level Structural Description of Streaming Applications
Anja Niedermeier, Jan Kuper and Gerard J.M. Smit

Combining Data and Computation Transformations for Fine-Grain Reconfigurable Architectures
Cristiano B. Oliveira and Eduardo Marques

Raising the Abstraction Level of HDL for Control-Dominant Applications
Marc-Andre Daigneault and Jean Pierre David

From OpenCL to High-Performance Hardware on FPGAs
Tomasz S. Czajkowski, Utku Aydonat, Dmitry Denisenko, John Freeman, Michael Kinsner, David Neto, Jason Wong, Peter Yiannacouras and Deshanand P. Singh

PPMC: Hardware Scheduling and Memory Management Support for Multi Accelerators
Tassadaq Hussain, Miquel Pericàs, Nacho Navarro and Eduard Ayguadé

DWARV 2.0: A CoSy-based C-to-VHDL Hardware Compiler
Razvan Nane, Vlad-Mihai Sima, Bryan Olivier, Roel Meeuws, Yana Yankova and Koen Bertels

Design Space Exploration for Automatically Generated Cryptographic Hardware Using Functional Languages
Davy Wolfs, Kris Aerts and Nele Mentens

Saturday, July 21, 2012

Proceedings of the 2012 Electronic System Level Synthesis Conference

http://www.ecsi.org/eslsyn
http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?reload=true&punumber=6230784

Saturday, July 14, 2012

Touch Screen for Zedboard

http://www.em.avnet.com/en-us/design/drc/Pages/7-inch-ZedBoard-Touch-Display-Kit.aspx

Thursday, July 5, 2012

Embedded Computer Vision Workshop 2012

http://cvisioncentral.com/promotion/evw2012/
http://cvisioncentral.com/wp-content/uploads/2012/04/EVW2012_Flyer.pdf

Monday, July 2, 2012

HLS Discussion

http://www.cadence.com/Community/blogs/ii/archive/2012/06/25/high-level-synthesis-users-productivity-gains-beckon-but-learning-curve-comes-first.aspx

SPACE - SystemC Partitioning of Architectures for Co-design of Embedded Systems

A company offering a design space exploration tool for hardware/software co-design on FPGAs
http://www.spacecodesign.com/products.php

Some Slides -
http://microelectronics.esa.int/core/eslday_2011/8_SpaceStudio_SpaceCodesign.pdf

Related Work for MAMPS
http://www.assert-project.net/-TASTE-

Tuesday, June 12, 2012

to read

TOP PICKS FROM THE 2011 COMPUTER ARCHITECTURE CONFERENCES

"What Is Happening to Power, Performance, and Software?" by Hadi Esmaeilzadeh, Ting Cao, Xi Yang, Stephen M. Blackburn, and Kathryn S. McKinley
"Dark Silicon and the End of Multicore Scaling" by Hadi Esmaeilzadeh, Emily Blem, Renée St Amant, Karthikeyan Sankaralingam, and Doug Burger
"Kilo TM: Hardware Transactional Memory for GPU Architectures" by Wilson W.L. Fung, Inderpreet Singh, Andrew Brownsword, and Tor M. Aamodt

Tuesday, June 5, 2012

DAC 2012 papers

1. Optimizing Memory Hierarchy Allocation with Loop Transformations for High-Level Synthesis
Jason Cong, Peng Zhang, Yi Zou

2. An Efficient Design Approach of Control Logic with the Use of High Level Synthesis for a Video Signal Conversion FPGA
Ryo Yamamoto - Mitsubishi Electric Corp., Kamakura, Japan

3. Exploring AES Design Variants with C-to-Gates for FPGA at Gb/s Line Rates
Kees Vissers, Fernando Martinez Vallina, Stephen Neuendorffer - Xilinx, Inc., San Jose, CA
Kristof Denolf, Ronny Dewaele - Barco, Kortrijk, Belgium

4. A Methodology for the High-Level Synthesis of Iterative Algorithms
Alessandro A. Nacci, Francesco Bruschi, Vincenzo Rana - Politecnico di Milano, Italy

5. Hardware Synthesis of Recursive Functions through Partial Stream Rewriting
Lars Middendorf, Christian Haubelt - Univ. of Rostock, Rostock-Warnemuende, Germany
Christophe Bobda - Univ. of Arkansas, Fayetteville, AR

6. Accelerating Neuromorphic Vision Algorithms for Recognition
Vijaykrishnan Narayanan, Ahmed Al Maashri, Michael Debole, Matthew Cotter, Nandhini Chandramoorthy, Yang Xiao - Pennsylvania State Univ., University Park, PA
Chaitali Chakrabarti - Arizona State Univ., Phoenix, AZ

7. A Compiler and Runtime for Heterogeneous Computing
Rodric Rabbah, Joshua Auerbach, David F. Bacon, Ioana Burcea, Perry Cheng, Stephen J. Fink, Sunil Shukla - IBM T.J. Watson Research Ctr., Yorktown Heights, NY

8. Removing Overhead from High-Level Interfaces
Kyle Kelley, Megan Wachs, John P. Stevenson,
Stephen Richardson, Mark Horowitz - Stanford Univ., Stanford, CA

9. Communication-Aware Mapping of KPN Applications onto Heterogeneous MPSoCs
Jeronimo Castrillon, Andreas Tretter, Rainer Leupers,
Gerd Ascheid - RWTH Aachen Univ., Aachen, Germany

10. Unrolling and Retiming of Stream Applications onto Embedded Multicore Processors
Weijia Che, Karam S. Chatha - Arizona State Univ., Tempe, AZ

Thursday, May 3, 2012

Monday, February 27, 2012

FPGA 2012

1. A Performance and Energy Comparison of FPGAs, GPUs, and Multicores for Sliding-Window Applications

2. Communication visualization for bottleneck detection of high-level synthesis application

3. VirtualRC: A Virtual FPGA Platform for Applications and Tools Portability

Thursday, February 2, 2012

Paper to look out for

VALVED DATAFLOW FOR FPGA MEMORY HIERARCHY SYNTHESIS
http://www.icassp2012.com/Papers/ViewPapers.asp?PaperNum=3058

A. Papakonstantinou, Y. Liang, K. Rupnow, D. Chen, “Automated Design Transformation for High Level Synthesis in FCUDA”,