help: where is "LE" or "LUT" component (or "how do I create my own components?")

Altera_Forum · 10-12-2007

I am a long time electronics designer and programmer who has managed to avoid FPGAs for a couple decades --- until now. I am designing a device with a small cyclone3 FPGA (EP3C5F256C8) and just started to create my design with Quartus2v71.

I chose to begin with schematic entry rather than HDL. I chose to start with the CRC32 part of my overall FPGA design, because the rest is less self-contained (but will probably be easier). First I designed an algorithm, wrote a C program to confirm it produces the correct results, then spent quite a bit of time reorganizing it until I found a way to implement it efficiently in hardware (only 2 levels of logic [delay] per clock cycle).

Having already read (okay, some reading, some skimming) the cyclone3 and quartus2 documentation, I drew my schematic partly in terms of components I expected to find in quartus2. To my shock and horror, the component I assumed would be easiest to find and most certain to exist --- seems to NOT exist. Specifically, I refer to that component called LE (logic-element) or LUT (16x1 lookup table) or LAB (16 LEs). After searching through the Quartus2 design functions/megafunctions dozens of times, I *still* cannot find any of these components. I am still not certain whether it is I who is stupid or crazy, or the designers of these tools (which mostly appear well designed, so "it must be me"). Though I tried to find creative ways to create an LE or LAB, I failed. For example, I tried to create an LE/LAB by creating a 16x1 ROM - but quartus2 assigned an entire M9k to simulate one utterly straightforward LE!!! Brilliant!!! Just not sure whether this refers to me or quartus2.

The first question is, "so what?". Just create the schematic with the components provided and quit being a trouble maker. Oh, okay. So I split all those "logical components" into the "components provided by quartus2" (where "logical component" refers to any component that BOTH performs a logical function in my design AND can be implemented with one LE/LAB [per bit]). The result? The schematic that fit on one visible screen exploded into a monster 3 screens wide by 3 screens high! Furthermore, what was very neat and logical and comprehensible became an "unwieldy mess" - even though I tried very hard to prevent that from happening.

This experience leaves me frustrated, so far. I figured "there must be a way for me to 'create components' --- essentially my own megafunctions --- which I could then insert into my original schematic". I still have a feeling quartus2 will let me do that, but so far I have not figured out how to do this.

This message is not just to make you smile - by reminding you what it was like back when you first started designing FPGAs - though it may perform that function too. :o Rather, I hope someone will tell me where to find a LE/LUT/LAB component - to solve my problem. Or alternatively, tell me how to turn the components in my graph-paper schematics into components I can then insert into "readable" schematics --- the way I would repeatedly insert function calls into a program instead of repeatedly cut-and-paste big inline chunks of code all over my program. I keep thinking quartus2 must provide a way to create components like those in the quartus2 drop-down lists of components/functions, but somehow my reading/skimming always "misses it". So somebody tell me, "where is it"?

Some people always demand examples (before they will help), so here is a typical one.

In my hand-drawn CRC32 schematic, I have a component that is essentially a modified (special-purpose) "registered 2-to-1 multiplexer". That's what each LE would be, though the component on my schematic performs the same function on 32-bits (ie, two LABs):

inputs = A, B (the two inputs to the multiplexer)

inputs = S1, S2 (two selection inputs to the multiplexer)

inputs = CLK (capture multiplexer output on positive edge)

The CLK input cycles once as each byte is captured by the logic, and drives most of the CRC32 circuit. The clock also toggles a binary up counter, which starts at 0x0000 and increments after each input byte is captured. The 2 low bits of this binary counter are fed into the component as S1, S2.

When S1,S2 = 0 or 1 or 2, the component simply passes input A to the output and captures it on the rising edge of CLK. When S1,S2 = 3, the component passes (A ^ B) to the output and captures that on the rising edge of CLK (where ^ means XOR).
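
As a behavioural cross-check of the component just described, here is a minimal Python sketch. It is a software model only, not HDL and not the poster's actual circuit. The class name `RegisteredXorMux`, the method `clock`, and the packing of S1,S2 into a 2-bit `sel` value are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF  # the component is 32 bits wide in the post


class RegisteredXorMux:
    """Behavioural model of the 'registered 2-to-1 multiplexer'.

    Hypothetical names: the thread only names A, B, S1, S2 and CLK.
    """

    def __init__(self):
        self.q = 0  # register contents, i.e. the component's output

    def clock(self, a, b, sel):
        """Model one rising edge of CLK.

        sel is the 2-bit value formed by S1,S2 (low 2 bits of the byte counter).
        """
        if sel & 3 == 3:
            d = (a ^ b) & MASK32  # S1,S2 = 3: capture A XOR B
        else:
            d = a & MASK32        # S1,S2 = 0, 1 or 2: capture A unchanged
        self.q = d
        return self.q
```

Calling `clock` once per input byte, with `sel` taken from the low 2 bits of a counter, mimics the behaviour described: three plain captures of A followed by one capture of A ^ B.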

Simple. All this maps perfectly into one LE per bit (2 LABs for the 32-bits in my design). And frankly, I have no hope my CRC32 algorithm will compute the CRC32 at the minimum speed I can tolerate (100MB/sec = 100MHz CLK rate) unless the logic is packed efficiently into LEs the way I designed it (2 delays, one of which is asynchronous (flow-through) LE, and the other of which is to read one value from one M9k memory block configured as a 256 x 32 ROM).
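
For readers wondering where a 256 x 32 ROM comes from: the textbook byte-at-a-time CRC-32 uses a 256-entry table of 32-bit words, one entry per input byte value. The Python sketch below shows that standard method only (reflected polynomial 0xEDB88320, as used by zlib). It is not the reorganized algorithm from this post, which the thread does not spell out, and the function names are made up for illustration.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial (assumption: the poster's variant may differ)


def make_table():
    """Build the 256 x 32-bit lookup table, one entry per byte value."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


TABLE = make_table()


def crc32_bytewise(data, crc=0):
    """Consume one byte per step, as one byte per CLK would in hardware."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc = (crc >> 8) ^ TABLE[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF


# Cross-check the sketch against the zlib implementation.
assert crc32_bytewise(b"123456789") == zlib.crc32(b"123456789")
```

Each step is one table read followed by an XOR, which is roughly the "one ROM read plus one LE" shape of critical path described above.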

The above is pretty typical of the components I found convenient to put in my hand-drawn graph-paper schematics. I will admit I prefer to "know what I am doing", which is contrary to modern modus-operandi (namely "build upon the work of others", which I cynically refer to as "blindly compound the mistakes of others"). You might claim I apply the xfiles "trust no one" philosophy to my work. I happily admit this, and my only regrets have been those times I got lazy and suckered for the modern lazy "dark side" notion that I can produce good devices while not understanding huge parts of my designs. Thus, I have the very unpleasant feeling the answer to this post is, "why on earth would you want to design with the components actually IN the devices? Wouldn't you rather design with expensive black boxes that you won't be blamed for when they malfunction?". My reply is, "no, but you need not agree with me, and you need not design my way".

However, since I do know what the fundamental components in the cyclone3 are, why on earth make it difficult (or impossible) to design with them? Every time an engineer designs with LE components, quartus2 need not grunt to figure out how to combine the scattered bits and pieces of a huge design, and quartus2 also need not wonder which parts of a design are important to "make efficient" (already made clear by the design[er]).

Okay, I am prepared to be laughed at. After all, I'm a total newbie to FPGAs & Quartus. But please include the "why" for your advice/excuses/alternatives, so even this silly, hopelessly stupid, retrograde Neanderthal can understand. Or --- you could just add LE & LAB to Quartus2 and pretend it was your idea! :rolleyes: I don't mind, as long as my designs get easier to create! Maybe I'll even stop avoiding FPGAs.

Thanks in advance for all helpful responses.

Altera_Forum · 10-12-2007

Hello Bootstrap,

I feel your pain. I am pretty new to this FPGA stuff also, and it is amazing that a program that is supposed to make your life so much easier when it comes to designing something that "you" need can do the exact opposite. I also have been working on a design that would work perfectly inside of a LUT, but also had no luck finding one in Quartus. I'd be interested to see if anybody actually helps you, because I really can't. I do know that if you can get your hands on a quartus "front" program, such as DSP Builder (which uses Matlab's Simulink), they have LUTs that can be somehow implemented. You lose a lot of control over your design though, which is why I am now just using Quartus, which, as you know, has plenty of issues of its own.

Thanks for the story, and sorry I can't be of any help.

Dave

Altera_Forum · 10-12-2007

For software in a language like C, the compiler creates the machine code. The compiler gives you the ability to insert some assembly statements where you really need low-level control. Similarly, FPGA compilers start with HDL or schematics to create the LUTs, registers, LABs, etc. while giving you some low-level control for those situations when you really need it.

If you insert an LCELL primitive, that will tell Quartus to make the combinational signal at the LCELL primitive be the output of a LUT. If you limit the logic in front of the LCELL primitive to what will fit in a single LUT, then you can essentially create each LUT yourself.

The Fitter will combine LUTs with registers to create LEs or ALMs. To tell the Fitter which LEs or ALMs to put in one LAB, use a LogicLock region that is one LAB in size. To tell the Fitter to put LEs or ALMs into a particular LAB, you can use a LAB location assignment.

It is rare to need to design at this low level. Most designs don't require it, and those that do usually require it for only a tiny portion of the design.

First try describing your design at a higher level of abstraction and giving Quartus complete timing constraints. If you don't get the area or timing results you need after doing all the normal things first to help Quartus optimize the design, then you can tweak portions of the design that need manual control.

Altera_Forum · 10-13-2007

Brad. This reply slightly exceeds the 10000 character limit, so I must post two messages.

----- first half of message -----

I inserted an LCELL into my schematic, but it has only one input and one output, so the LCELL certainly is not a LE or LUT or LAB component itself. I read all the help about LCELL in the integrated Quartus2 help system, but do not understand how it helps me.

I sorta more-or-less understand your description of LogicLock, I think. However, to the extent I do understand what you say about LogicLock, it only solves half my problem, and not the most annoying half. I refer to the "exploding schematics" problem, where nice neat compact schematics explode into enormous badly organized tangled messes.

If I understand your description of LogicLock, I can effectively draw a boundary around a group of components on my huge messy outta-control schematic --- and tell Quartus to treat this like a single component or minimum number of components to the degree possible (obviously wires come in and go out).

Perhaps this encourages Quartus2 to merge components within the LogicLock domain to the degree possible before it performs the general merge attempts across a whole design. This does increase the probability the components the designer wants together are in fact put together - but I very much doubt it can assure this. Are you sure Quartus2 will combine the three elements of my example into one LE/LAB --- the 16-bit XOR functionality with the 4-to-1 multiplexer functionality with the output register functionality? Perhaps, but even though I am a total newbie FPGA designer, I designed several CPUs and other large devices (yes, from hundreds of gates and simple MSI like muxes), so I tend to give weight to my impression that my entire design will be impractical because the speed of this part of the design is insufficient --- EXACTLY BECAUSE --- these three components are not combined in the way I require. And I can tell you, with total certainty, that designing with FPGAs shall meet an instantaneous, unfortunate, unnecessary and possibly permanent end at our product development organization *if* that happens and *if* I cannot find a way to force Quartus2 to do what I *should* be able to tell it in the first place (make these components into one LE or LAB (one LE per bit in each bus involved)).

Please, do not read an arrogant tone into the above. I am simply trying to describe what must happen in this particular case. We have an FPGA that (I am very confident) is capable and fast-enough to do the work we require of it, but we seem to have found we are in danger of being told our design is "not fast enough" [to run in our chosen part]. The problem with that is - I have checked prices, and we cannot afford the cost of the next fastest version of the EP3C5F256 chip (and certainly not the fastest version).

Therefore, I still believe I may have a valid point. That point is, essentially, that Altera will lose some smallish but non-trivial percentage of FPGA sales because one small, trivial-to-implement but extremely fundamental feature is missing from Quartus2. Altera could probably implement the solution in a few different ways, but *certainly* the most friendly way is the following:

Proposal: Make the following change to Quartus2 (at least in the schematic entry mode): Add the following components: LE4, [LAB]. The functionality of the LE4 is exactly equal to the current standard LEs (logic elements) that comprise the vast majority of current FPGA chips (cyclone3 for example). The schematic package looks like this:

inputs: A,B,C,D - these are variable mix of data-inputs and control-inputs

inputs: CLK - captures the output of the LUT on rising edge

inputs: ENA - enables the CLK signal

inputs: REG - selects registered output (otherwise the LUT output leaves the LE)

inputs: ACL - asynchronous clear (only if definitely always available in every LE situation)

inputs: ??? - any other inputs that are practical to [optionally/sometimes] support

outputs: OUT - direct output of LUT or registered output of LUT

mode: any practical mode settings would be set in a dialog box, presumably to tell the LE to reconfigure the LUT from the normal (4:1) mode to the (2x 3:1 + 2:1) mode and so forth (to handle less-common functionality such as carry-in / carry-out, shift-in / shift-out, and so forth).

Note: If any less-common functionality requires excessive kludge or confusion to this LE4 component, the appropriate response is to omit that functionality - do not fail to implement the LE4 and LAB just because some minor functionality is not convenient to represent.

I do not see the LAB component as an actual component, really. I just say the term LAB to imply that Quartus2 should happily create and/or merge side-by-side identical LE4s, in exactly the way Quartus2 does for many other components. If Quartus2 does not support a way to merge any arbitrary number of identical LE4s into a single package/component, then at least support groups of 4, 8, 16, 32 [and 64].

----- message continued in separate posting -----

Altera_Forum · 10-13-2007

---- second half of message -----

You said, "If you limit the logic in front of the LCELL primitive to what will fit in a single LUT, then you can essentially create each LUT yourself.". I worry this will not suffice. For example, consider the component I described in my original message. To implement this with components available in Quartus2 would end up having 3~4 levels of components in a serial = sequential configuration.

1: compute two values: (A ^ B) and (A = unmodified)

1a: output AXORB = A ^ B

1b: output A = A

2: multiplexer selects one of two input values based upon low 2 bits of xcounter

2a: xcounter = 0 or 1 or 2 ::: route signal A to output

2b: xcounter = 3 ::: route signal AXORB to output

3: register computed result

3a: output of multiplexer routed to input of register

3b: when CLK rises, capture value at register input pin and output from register

4: LCELL

4a: output of register routed to input of LCELL

4b: input of LCELL routed to output of LCELL

Now, I understand that LCELL may in fact be a fictional device for Quartus2, and therefore hopefully is not a real LE/LUT/buffer/register/anything on the FPGA. But, assuming that is true, what LCELL "sees" feeding it is - *one register* --- THAT IS ALL.

You see, this is one major problem with the design of Quartus2 (in my not-humble-enough but admittedly newbie opinion). I would have included an optional "registered output" on every single component. After all, the registered output is in every single LE, and in the synchronous designs we want to [and almost need to] design with, the need for registered output is very, very common. Instead, Quartus2 forces us to add hundreds of 8ffxx or 74374 packages all over our schematics, which bloats and confuses them horribly.

You see the problem? How many levels back does the LCELL "look" for opportunities to merge levels together? Even if the answer is "many levels", it is not always possible to find components that allow the designer to express the functionality in ways Quartus2 will be able to detect the merge possibilities. Give us LE4 components and "problem solved".

You say that the largest part of designs do not need to be designed this "low level way". True enough - sometimes the components offered express the design reasonably well. Some sections of an FPGA run much faster than required, so inefficient implementation of LEs has no consequences (unless you happen to ooze beyond the size of your FPGA). However, only the smallest trivial designs are not massively expanded into a huge chaotic mess of utterly pointless components. I mean really, why fill up tons of space with 4 more 74374 components --- when you could have run a CLK line into the logic that feeds them? This single monstrous bloatocracy is repeated *endlessly* in Quartus2 schematics - even though the output of every LE in the FPGA can be registered or not. I am painfully aware of my status as "total newbie" (almost "newborn" you might say) and therefore any claim of "stupid design" from my fingertips is likely to be smashed flat by heavy sledgehammers of experience. Nonetheless, I will take the risk and venture to provisionally call this omission "absurd" (and save "stupid" for next time - hopefully not). But understand this. My only purpose of "tweaking" someone with opinions like "absurd" --- is to get myself squashed to atomic scale when some expert (?Quartus2 designer?) shows me how to do what I need, or tells me why my desired approach is even dumber than stupid. Either way, then I'll know how to move forward efficiently (with or without FPGAs).

I am not sure how to answer your last two paragraphs concisely or comprehensibly. I freely admit to being on the "fringe" in terms of how I design hardware and software. Specifically, I *do* "sweat the details" more than most designers. Every time I try the conventional (lazy) ways that other developers advocate, I get trashed one way or another. I have learned, from long, hard, nasty, painful experience to "trust no one". Almost every time I tried to avoid understanding or implementing the low-level internal details, I was terminally shafted --- I had to either "abandon project" or "start all over" or at least "ditch their high-level component and implement everything myself". And this gets worse every year, as more "implementors/sellers" adopt the bogus/corrupt/hostile philosophy (rationalization) of "let the buyer beware". They mislead us to believe their product will do everything you want and more, and do so better and faster than you can even imagine. It has gotten so bad (slowly, so people just accept it as "normal") that I see many design engineers and groups fail time after time after time, while I succeed time after time after time. Yet they never stop telling me how stupid I am for insisting that I must understand every detail of every nook and cranny of my designs, and adopting what I call my "all things considered" approach to product design. They honestly admit their amazement that I finish projects so fast and so well, but when I ask how that is consistent with their endless claims I do everything in stupid and inefficient ways, they just shrug.

This is my way of saying, I honestly don't know what you propose. When you say I should take a high-level view, do you mean I should draw a big rectangle with inputs and outputs and label it "compute CRC32"? If so, I have done exactly that --- but that is my block diagram. Unless Quartus2 has achieved human-level++ consciousness, I am quite sure it cannot convert this block into a complete working interconnect of LEs!!! Or can it?

And if I put "timing constraints" on the inputs and outputs of this block diagram, how does this help Quartus2 know what logic belongs inside the block? I am missing something. Whenever I have first conversations like this with other designers, we always find each other talking from another dimension in the twilight zone (or at least "on another page").

BTW, in my view, what I propose is trying to move TOWARDS the higher-level. Is that not true - in your opinion? Consider this. If I combine several elements (XOR, multiplexer, register, etc) into one single coherent logical functional entity --- is this not exactly what you refer to as "higher level"? If so, this is exactly what I wish to do, what I am trying to find ways to do in Quartus2 - to design at the higher "logical" level. It seems like I should be able to do this, and I still suspect some Quartus2 expert will jump in and tell me how it can be done, and where to find the magic pulldown menus.

Are you listening, sir Quartus2 developer :cool: sir? Please help this Neanderthal!

----- the end -----

Altera_Forum · 10-13-2007

By higher-level, Brad is talking about Verilog or VHDL, where you would describe the functionality of the design in a specific syntax, somewhat like C for SW code. You said you chose schematic over HDL... Instead of creating a schematic that includes all the components, FPGA designers typically use these Hardware Description Languages to describe the functionality. So your function would contain things like an input signal A, an input signal B, and a signal that = A XOR B. Then you have an if statement, saying something like "if xcounter is 0 or 1 or 2 then result = A", and so on. The software will create the look-up tables to implement the function you described. You'd also have an input clock and an output for the registered result. Then you define that at the rising edge of a clock event, reg_result = result, and the software will take care of creating the register for that function. Then output_pin = reg_result. Obviously I'm not giving you the syntax, but I think it wouldn't be too many lines of HDL code...
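
To make that description concrete without committing to HDL syntax, here is a small Python sketch of the per-bit function and of the 16-entry look-up table that can be derived from it by enumeration. The input ordering (A as index bit 0, B as bit 1, S1 as bit 2, S2 as bit 3) is an arbitrary assumption for illustration; a real synthesis tool chooses its own ordering, so its mask could differ by a permutation of the entries.

```python
def result_bit(a, b, s1, s2):
    """Per-bit combinational function: A XOR B when S1,S2 = 3, else A."""
    if s1 and s2:
        return a ^ b
    return a


def lut4_mask(func):
    """Enumerate all 16 input combinations into a 16-bit truth table."""
    mask = 0
    for index in range(16):
        a = index & 1           # assumed ordering: A is index bit 0
        b = (index >> 1) & 1    # B is index bit 1
        s1 = (index >> 2) & 1   # S1 is index bit 2
        s2 = (index >> 3) & 1   # S2 is index bit 3
        if func(a, b, s1, s2):
            mask |= 1 << index
    return mask


print(hex(lut4_mask(result_bit)))  # 0x6aaa under the assumed input ordering
```

Because the function depends on exactly four signals, its whole truth table fits in 16 bits, which is why the XOR and the multiplexer need not cost separate logic levels.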

The idea is that the synthesis tool, Quartus II in this case, can determine how to map the code into the target device. And you don't have to worry about exactly how many LEs/LUTs/registers etc are required or how many levels of logic. Also you can change device easily. For example, you were creating 4-input LUTs. But the Stratix II and III devices, for high-performance and much larger designs, use 6-input LUTs. So your 4-LUT would not be the most efficient for those devices. By creating general code that is not architecture dependent, it makes your design much more portable. There are things you can do to make your design architecture-friendly, and it's great that you are thinking about the target device when you create your design, but the tools don't expect people to go down to your level of detail.

I can understand wanting to have total control over it, it's kind of a natural instinct for us techie types. It works OK for a small design but if you had a huge logic function implementing say, an embedded microprocessor with a memory controller and some DSP processing and who knows what else, it would be far too difficult to implement at the very low architecture level.

When you describe the functionality instead of instantiating the components, you can then focus on a higher level to ensure that the register-to-register paths occur in a delay that allows your clock to run at the right frequency. That's what the timing assignments do.
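
A rough illustration of what such a timing requirement checks: the delay along a register-to-register path has to fit inside one clock period. The Python sketch below does only that arithmetic. The function names are hypothetical and any delay numbers fed to it are placeholders, not Cyclone III data; real values come from the timing analysis report.

```python
def clock_period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def slack_ns(freq_mhz, clock_to_out_ns, path_delay_ns, setup_ns):
    """Time left over on a register-to-register path.

    Positive slack: the path meets the requested clock.
    Negative slack: the path is too slow for that clock.
    """
    used = clock_to_out_ns + path_delay_ns + setup_ns
    return clock_period_ns(freq_mhz) - used
```

For the 100 MHz target mentioned earlier in the thread the period is 10 ns, so every register-to-register path, including any ROM read and LUT delay, has to complete within that budget.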

You will have to "trust" the tool to a certain extent to create the right logic from your code. But the tools are pretty good these days! Sure, from time to time you may encounter something that you think you could have done better by hand, as Brad suggested. The tools do provide ways to optimize designs, and make low-level changes to the netlist, but that's getting pretty advanced for your first FPGA design.

For some examples of VHDL and Verilog code that describe functions like memory, multipliers, multiplexers, and state machines, see the Recommended HDL Coding Styles at http://www.altera.com/literature/hb/qts/qts_qii51007.pdf. If you're going to give it a try, a basic Verilog text book would probably be a good start. There are some Templates in the Quartus II text editor if you right-click within a text file in the GUI.

Now, all that being said, since you have already done the work to split this design into look-up-table functions, there are some Altera primitives provided to help you specify LUT inputs within HDL code. I'm not sure if they will do everything that you want to do, but maybe you can take a look at the docs to see. Designing With Low-Level Primitives User Guide at http://www.altera.com/literature/ug/ug_low_level.pdf. There is some info about instantiating registers (which is not too hard, and you should have found DFF primitives in the schematic libraries already). Then check out the Look-Up Table Buffer Primitives on page 13 for specifying LUT functions. The idea is that you could use HDL to connect up the LUT buffers and registers that you need, if you want to create a very specific function at a low level.

Not sure if that helps... Hope it at least makes you feel less like you're in the twilight zone!

Altera_Forum · 10-14-2007

Thanks for your ideas and comments, Stevie (and Brad).

I downloaded, printed and read the document about "primitives" that you mentioned. Though I still feel like I was reading greek-to-me, I nonetheless believe I correctly understood one aspect of the document. And that is, the conceptual and design philosophy appears to be exactly backwards [from what I want to see]!

Specifically, I keep reading how these various pieces impose/create/force "divisions" or "boundaries" in the design. Therefore, they appear to be an appropriate design tool for developers who want to be able to say "now listen, Quartus2, you must not combine anything before this LUT_INPUT or after this LUT_OUTPUT into a LE/LUT/LAB".

This is exactly 100% opposite of what I want. I want to say, "now listen, Quartus2, you must combine all the following logic elements into one LE/LAB" (unless I wrongly assume you can implement this functionality in one LE/LAB - in which case you should generate an error to inform me of my ignorance of the nature of LEs or LABs).

To me, the Neanderthal designer, the ability to force a set of logic and register-capture to combine into one physical device element is infinitely more important than the opposite, which is what all the components and techniques I see in the documentation (and that you point me to) provide. Why on earth would I want to prevent logic from being combined together to become faster and smaller? Perhaps I might want to prevent Quartus2 from combining *this* set of logic together with *wrong* sets of logic. But these "boundary" components and techniques prevent "this" set of logic from merging with *any* other logic, including the logic I want or need it to combine with. Sorry, but I consider that a preposterous solution to my problem, which is "how do I make sure these sets of logic become combined into one unit, the way I want them to?".

I can think of one possible reason that I very much hope is not true (but I think I'm beginning to smell a "rat" of some kind). Altera thinks that we [most sucker engineers] will pay for whatever speed part Quartus2 tells us we must. And they know one way to artificially increase the number of designs that require faster parts is - to limit the number of ways designers can make sure the critical parts of their designs are implemented efficiently. And "efficiently" in FPGAs means "carefully organize the circuit design so the critical speed path is implemented in a way that can be expressed in the minimum number of levels of LE/LABs". At least that's what this newborn Neanderthal believes he knows after his first questioning observations of the world of [altera cyclone3] FPGAs.

As a long time hardware and software engineer, the notion "find your critical path and spend as much time as necessary to find the fastest possible way to implement that" is as natural as breathing. When I wrote a 3D graphics (game/simulation) engine recently, I noticed the "transform vertex positions to [world (or display)] coordinates" was the most often executed "critical path" in the engine. All the lazy numbhead engineers said the same thing, "you cannot possibly do better than the [OpenGL/DirectX] routines - they have been optimized by the smartest geniuses in the universe". Stubborn doofus that I am, I proceeded to break my butt to totally, clearly understand what needed to be done, and what would be the most efficient data-organization, algorithm-organization and implementation to implement this part of the algorithm. So I ended up coding one 32-bit and one 64-bit set of SIMD/SSE2+ assembly language code to perform this functionality and the result was... [drum roll, please]... my entire program became 6+ times faster. This did not shock me. But, as usual, everyone else who had warned me not to waste my time and effort, did what they always do - they just shrugged, said nothing, and walked away.

Look, I've "been there done that" too many times to be uncertain about this, or give in. The notion of "identify and optimize the critical path" is not my invention or discovery. This has been common knowledge for decades if not centuries (or millennia).

Therefore, until someone explains to me why optimization of the critical paths in our designs has been purposely blocked, I will provisionally assume the answer is not an innocent one (because it rarely is). Yes, I am in a [seeming if not actual] difficult position here, being an admitted newbie Neanderthal in FPGA design. But I have designed several 16 to 64 bit CPUs (from scratch, mostly with gates and/or modest MSI), so I have some experience and some basis for being skeptical. But at least I very much *want* to be shown that critical path optimization has not been blocked. Because I have work to do, and I'd rather not eject all FPGAs out the airlock. I do have alternatives, and they are cheaper and more familiar to me (albeit larger on PCB).

BTW, the LUT_INPUT and LUT_OUTPUT do not appear to be components available in any of the schematic pulldown menus (though the LCELL is).

One final comment. A not perfect but certainly reasonable way to provide the functionality I want might be available - but I do not understand Quartus2 well enough to know. The following is what I refer to (tell me if this is possible).

If I could make a separate schematic that contains nothing but the elements of the logical component that I need (and that fits into one LE per bit) --- and then have that "synthetic component" available to insert (as a "black-box" or "functional block diagram") in other schematics, the capability and functionality I seek might be available (in practice). While this does not *force* Quartus2 to implement the innards of the "synthetic component" in one LE, this formulation certainly delivers a very strong hint to Quartus2 to try to implement the synthetic component in as few LEs as possible. Can this be done? If not, why on earth not?

PS: I am aware that most FPGA designers write HDL. I am not sure whether this is because many FPGA tools do not support schematic input, or why. I am certainly comfortable with writing code (microcode, assembly, C/C++, etc), but every hardware design/device I have ever created has been with schematic. Personally, I find this better because the ability to visually observe the whole design (then visually drill-down into any subsection) emphasizes the key difference from software --- the entire schematic is "executing" simultaneously all the time. This makes schematic entry for FPGAs much more natural for me. I looked into verilog for awhile after reading your posts, but got grossed out at the way it is formulated (and even more so with VHDL, over all). Maybe I just looked at horrible examples, I dunno. However, the exact same issue exists - I damn well expect to be allowed to identify and carefully optimize my critical paths. To tell me, a CPU designer, that my attitude about "critical paths" is unreasonable --- well --- you don't want to hear my response to that! IMHO. :-) I do appreciate all honest responses, no matter how opposite they may be. Who knows, for some reason I cannot envision, you may be right (for FPGAs). But with me, you really need to prove it. Sorry about that. :o

PSS: No, I already know this device cannot afford anything slower than, or more expensive than, the slowest version of the smallest cyclone3. So no, I cannot just let Quartus2 implement my design in some inefficient way - and then steal our money several million times over. This is not a game or educational experience for me. This is serious product design, that must be as competitive as possible (especially these two products). And to answer your comment about "trust", no I cannot "trust" Quartus2. Given my experience with most modern american companies, I know better than that. I will, given positive experience with a company for a couple years (at least), tend to give benefit of doubt (suspect myself rather than them, at first). I just know too much, have experienced and seen too much, understand too well what is modus-operandi of most modern american companies. On the flip side, this makes me value honest people and companies all the more --- once I identify them. I will keep digging and questioning this issue until I am confident I have found and clearly understand the answers.

Altera_Forum · 10-15-2007

Hi Bootstrap -

I like your DirectX comment. You might like WYSIWYG cells. You set the function as a truth table and a few switches, hook up inputs, and it goes in one cell unless completely illegal. WYSIWYG is to assembly as primitives are to (strange compiler pragmas).

There are WYSIWYG CRCs of various common widths for Stratix II (wrong chip, but a good demo) at these links -

www.altera.com/literature/manual/cookbook.zip

www.altera.com/literature/manual/stx_cookbook.pdf

In the ZIP under crc/xor6.v is a little cell building block. The truth table is 6996... which you may recognize as a hex XOR pattern. crc_register.v has the "registered 2:1 mux" type building block you mentioned earlier. The tools will have no freedom to (e.g.) move the synchronous clear AND gate out into the LUT. These CRCs are hard factored for minimum depth on 6-LUT parts. It sounds like you've already figured out how to do the pattern for minimum depth on 4s.
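
For anyone who does not recognize the 6996 pattern: it is what you get when you write out the truth table of an XOR bit by bit. The Python sketch below derives it by enumeration. The function name `xor_lut_mask` is made up for illustration, and the convention that bit i of the mask is the output for input combination i is an assumption about how the table is laid out.

```python
def xor_lut_mask(num_inputs):
    """Truth table of an N-input XOR: bit i is the parity of index i."""
    mask = 0
    for index in range(1 << num_inputs):
        if bin(index).count("1") & 1:  # odd number of inputs high -> output 1
            mask |= 1 << index
    return mask


print(f"{xor_lut_mask(4):04X}")   # 6996
print(f"{xor_lut_mask(6):016X}")  # 6996966996696996
```

Under this convention the low 16 bits of the 6-input table equal the 4-input table, which is the sense in which a truth table can be "shortened" when the two highest inputs are dropped.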

To port from Stratix II to C3 I think you can just change "stratixii_lcell_ff" to "cycloneiii...", delete the extra LUT inputs, and shorten the truth tables to 16 bit. Let me know if you want to go there. I can explain the nitty gritty in more detail.

Altera_Forum · 10-15-2007

About the schematic vs. HDL topic:

I switched from schematic to Verilog and love it because ...

MODELSIM! A huge pain to learn. Essential once you get used to it.

Portability. Schematic is too vendor specific. Worry that it might stop working in a few years, etc.

You can generate Verilog easily with C / scripts, etc. It is clear that with Verilog entry and a C compiler in the background I can make any circuit I want and my hand won't fall off from clicking.

I do fiddle with schematics on paper while designing / optimizing / thinking. For what it's worth.