How to create a new schematic symbol in the Eagle editor

November 1, 2012, 7:41 pm

This tutorial describes how to create a custom schematic symbol in the CadSoft Eagle editor. It assumes that you have some familiarity with Eagle and just want to create a schematic (not a PCB), but can't find a component you need.

Creating a part is surprisingly tricky - Eagle is one of those software packages with a GUI that looks more intuitive than it is. To create a component, there are three abstractions to deal with: Symbol, Package, and Device. The Symbol is the symbol as it appears in the schematic. It must be tied to a Package, which describes the shape of how the component is physically mounted on a circuit board, in particular the pads for the pins. Finally, the Device holds the complete description of a device, including the symbol and potentially multiple packages. Thus, even if you just want a schematic symbol, you must also deal with the package and device. The following image shows important parts of the Device screen.

Example: Zener diode

For example, suppose you want to create a Zener diode symbol by modifying one of the existing diode symbols.

First, find a part in an existing library that you want to modify, e.g. use Edit > Add, then Search for something similar. Note the name of the library and the device, e.g. "diode" and "1N4004".

Next, create your custom library by going to the Control Panel and selecting File > New > Library, or open your existing library.

In Control Panel expand the library that has the component you want to copy and find the package you want. Right click and select "Copy to Library".

Next, open the new symbol in your library: Go to Library > Symbol and select the Symbol. Or from the Device screen right click on the crosshair in the middle of the symbol and select Edit Symbol.

Finally, you can edit the symbol using the standard Eagle editor functions. To make a Zener diode, I needed to change the grid resolution to 0.025, select the angular wire bend, and then use Wire to add the "fins" to make the symbol look like a Zener diode.

Go to File > Save As..., and save the new library.

To use this new library in your schematic, select Library > Use, and add your new library.

Improving the Symbol

The steps above are sufficient to edit and use a new component, but you may want to clean things up a bit.

To rename your symbol, go to Library > Rename, and enter a new name for the symbol.

To add a description, click on the Description link and enter a description using HTML. Typically it has <b>the title</b> <p>, and then a description.

To move the >NAME or >VALUE, use move and click on the crosshair at the lower left. You might want to temporarily reduce the grid size to get more control over the position.

Improving the Device

Go to Library > Device, and select the device.

To rename the device, use Library > Rename. You can also change the description as above, which is useful for searching.

To change the symbol's prefix (e.g. D1, D2, D3 for diodes), click on Prefix and enter the desired prefix.

You probably just want to have one package, so delete others by right clicking on them on the Device screen and selecting Delete.

To rename the package, go to Library > Package, select the package, and then use Library > Rename. You can also change the description.

Adding new pins

If you want to add new pins to a symbol, things become more complicated, because you also need to add pads to the package and connect the pins to the pads, even if you don't care about PC boards. I recommend picking a starting symbol with the right number of pins if possible. But if not, use the following steps to add new pins.

Add the pins using Draw > Pin. You may need to rotate the pin. Right click the pin and select Properties to change the length or other property. You can enter a pin name or set Visible: Off if you don't want the pin name to show up.

To add pads, go to the Device, right click the Package, and select Edit Package. Add a Pad (anywhere) using the green circle icon.

Next you need to connect the pins and pads. Go to Library > Device and select your device. You will notice in the package pane an exclamation point in a circle. Underneath, click Connect, which will bring up the Connect panel. Select a Pin and a Pad and click Connect, repeating until all pins are connected, then click Ok. Back at the Device screen, you should now see a checkmark next to the Package. If you don't connect the pad to the pin, you will get "Error: Device ... has unconnected pin (G$1/P$1)!" when you try to add it to the schematic.
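The relationship between pins, pads, and the Connect step can be pictured with a small data model. The sketch below is purely illustrative: the class and method names are hypothetical and are not part of Eagle or its scripting interface. It only mirrors the rule that every symbol pin must be mapped to a package pad before the device can be used.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical stand-in for an Eagle Device: one symbol, one package."""
    name: str
    pins: list[str]                      # pin names from the Symbol
    pads: list[str]                      # pad names from the Package
    connections: dict[str, str] = field(default_factory=dict)  # pin -> pad

    def connect(self, pin: str, pad: str) -> None:
        """Map one pin to one pad, like one click of Connect in the panel."""
        if pin not in self.pins:
            raise ValueError(f"unknown pin {pin!r}")
        if pad not in self.pads:
            raise ValueError(f"unknown pad {pad!r}")
        self.connections[pin] = pad

    def unconnected_pins(self) -> list[str]:
        """Pins that would trigger the 'unconnected pin' error."""
        return [p for p in self.pins if p not in self.connections]


zener = Device("ZENER", pins=["A", "C"], pads=["1", "2"])
zener.connect("A", "1")
print(zener.unconnected_pins())   # ['C'] until the second pin is connected
zener.connect("C", "2")
print(zener.unconnected_pins())   # []
```

In this model, an empty list corresponds to the checkmark next to the Package, and a non-empty one to the exclamation point.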

Creating an IC (or other component) from scratch

To create an IC, you can modify an existing IC device, a generic package from the ic-package library, or start from scratch. The existing devices are very function-specific and the ic-package symbols are kind of ugly, so you may end up needing to start from scratch. It's not too difficult and only takes a few minutes, but there are more steps than you might expect.

First, create a package with the right number of pins. Go to Library > Package, enter your package name (e.g. DIP8) after New, and click Ok. Using the green pad, drop 8 (for example) pads onto the package - the positions don't matter if you're not generating a PCB. Use Properties on each one (right click) to give the pads names 1, 2, etc.

Go to Library > Symbol, enter your symbol name after New (e.g. 555), and click Ok. (For this example, I'll create a 555 timer from scratch even though the library has one.) Put down all the pins for your IC, leaving plenty of room horizontally for the labels, rotating the pins as necessary. Under properties, give each pin the desired name and set length to short. Use four wires to create the outline. Put Text >NAME on the name layer (95) and Text >VALUE on the value layer (96). Add a description if you want.

Go to Library > Device, enter your device name after New (e.g. 555) and click Ok. Click on Add, select your new symbol, and add your symbol to the device. Under the package pane, click New, select your package, and click Ok. Click Connect, and carefully connect all the pins to the right pads, so the pin numbers show up okay. (Tip: do the pins in numerical order.) Click Ok. Add a Prefix and Description if you want.

Save your library with File > Save, and it should be ready to use.

Conclusion

CadSoft's Eagle PCB software is very useful for generating schematics, but the components you need are often missing. If you know the tricks, creating new symbols is not too hard, though. (If you want to create parts for a PCB, see Creating a new device in Eagle or Instructables or Sparkfun's tutorial.) I wrote this tutorial mainly for my own benefit, but I hope others find it useful too. Please leave a comment if you find errors or have additional suggestions.

↧

Obama on sorting 1M integers: Bubble sort the wrong way to go

November 3, 2012, 8:32 am

Recently StackOverflow and Hacker News discussed the question of how to sort 1 million 8-digit numbers in 1 megabyte of RAM. This reminded me of when I saw Obama in 2007 at Google - Eric Schmidt asked Obama how to sort 1 million integers as a laugh line, but Obama shocked him by answering "I think the bubble sort would be the wrong way to go." The video is pretty entertaining:

Here's the transcript:

SCHMIDT: Now, Senator, you're here at Google and I like to think of the presidency as a job interview. Now, it's hard to get a job as president. And--I mean, you're going to do a great job. It's also hard to get a job at Google. We have questions and we ask our candidates questions. And this one is from Larry Schwimmer. What--you guys think I'm kidding, it's right here.[1] What is the most efficient way to sort a million 32-bit integers?
OBAMA: Well...
SCHMIDT: Maybe--I'm sorry...
OBAMA: No, no, no, no. I think--I think the bubble sort would be the wrong way to go.
SCHMIDT: [facepalm] Come on. Who told him this? Okay. I didn't see computer science in your background.
OBAMA: We've got our spies in there.[2]

If you're wondering about the complete answer to the sorting question, see a detailed explanation.[3]

All in all, it was a very unexpected answer from Obama with perfect delivery.

Notes

[1] Nervous laughter greeted the mention of a Larry Schwimmer question, because he once asked Jimmy Carter about his UFO encounter.

[2] Eric Schmidt had asked the sorting question to John McCain on a different visit, with the expected result. See the YouTube clip of McCain's visit.

[3] The pedantic may note that Obama's question is slightly different from the Stack Overflow question since it involves 32-bit integers vs. 8-digit integers, and the memory constraint was omitted.

↧

Teardown of the mysterious KMS 4-port USB charger

November 11, 2012, 11:26 am

In this article I tear down a 4-port USB charger of puzzling origin. This charger is a huge step above the $2 counterfeit chargers I examined earlier in design and manufacture, but considerably below the quality of name-brand chargers. Likewise with safety - the charger was built with some attention to safety, but appears to fall short of UL standards.

One puzzle about this charger is it's unclear who makes it and what model it is. The case says it's the KMS AC-09 but the circuit board says "TC09-new-V4.2". Amazon lists the brand as "Cosmos®", but I couldn't find any sign that KMS or Cosmos are actual companies. After some web searches, I think the charger is built by Guangzhou Panyu Qiaonan Saidi Electronic Factory (more) as the TC09 charger for $5.30 wholesale, or maybe HK Yingjia International, a consumer electronics manufacturer in Shenzhen (more). In any case, I'll call this the "KMS charger" since I need to call it something.

In my previous lab analysis, I compared a dozen different chargers in 9 different categories, rating them from 1 to 5 'bolts'. The KMS charger came in about average in terms of performance. The results for the KMS charger are summarized below. For details on these measurements, see my previous article, A dozen USB chargers in the lab.

Overall rating
Vampire (idle) power usage
Efficiency under load
Achieves power rating
Spikes in output
High-frequency noise in output
Ripple in output
Voltage sag
Current sag
Regulation quality

The good and the bad

Overall, this charger is much higher quality than the $2 counterfeit chargers, but considerably lower quality than name-brand chargers.

The charger provides more filtering than basic chargers, from the large input choke to the multiple output inductors. It includes X and Y capacitors for filtering.

The charger looks mostly safe, although it doesn't have UL certification and I suspect it would fail certification. The 6mm clearance between the primary and secondary looks solid. However, the transformer windings are only separated by 3mm, rather than 6mm, as I show below. (This is still much superior to the $2 chargers that have almost no separation.)

One interesting feature of the power supply is the power plug can be interchanged for use in different countries. (Some other chargers such as the HP TouchPad and Apple iPad are similar.)

The charger has some quality issues. The power quality measurements I did in my previous article show the KMS charger has fairly poor quality output, with a lot of noise in the output.

The IC datasheet recommends 200 mm² of foil on the IC output pins to provide cooling. I measured about 18 mm² (less than 10% of recommended), which suggests the charger may overheat under full load.

The above photo shows that the build quality of the charger is not extremely high. The inductor at the front right is very crooked, and the optocoupler at the left is somewhat crooked. While this doesn't affect the performance, it shows the assembly was rapid rather than careful. More concerning, some of the solder joints appear to be almost bridged, which could cause catastrophic failure of the charger. I also found a government report of a KMS charger catching fire, apparently due to a loose wire in the power plug.

One unique feature of the charger is the blue LEDs which cause it to emit an eerie blue glow when in use. A lot of users dislike this though (according to reviews), because the light is distracting at night.

The circuit

Annotated schematic of the KMS TC-09 USB charger.

For readers interested in circuits, I have prepared the above approximate schematic (click for a larger view). The circuit is pretty straightforward compared to other chargers (look at my iPhone charger schematic for comparison). Starting at the upper left, the input AC is converted to DC by the diode bridge, and then filtered by a simple inductor-capacitor filter. This high-voltage DC is connected to the flyback transformer primary. The THX203H control IC switches the other side of the flyback transformer to ground through the current-sense resistors R12A and R12B and inductor L3. (Most chargers use a separate switching transistor, but in this charger, the transistor is inside the control IC.) The snubber circuit R2, C3, and D6 absorbs some of the high-frequency switching spikes (although looking at the output below, this circuit isn't entirely successful). The auxiliary transformer winding and D7 and C4 provide the DC power to the control IC. The optocoupler provides feedback to the IC, indicating the output voltage level.

On the secondary side, the high-speed Schottky diodes (D5) convert the transformer output to DC. This is then filtered through an inductor-capacitor filter that smooths it out. The output voltage feedback is generated by the TL431A regulator and fed into the optocoupler.[1]

Finally, the actual USB output circuitry has more components than you'd expect. For each pair of ports, four resistors set the D+ and D- voltages to indicate to devices that the charger is (pretending to be) an Apple 2A charger. Each port has a small bypass capacitor to smooth out power transients. Finally there are two blue LEDs with current-limiting resistors to provide the blue glow.

The controller IC poses a bit of a mystery. It's labeled as the THX 203H controller, which turns out to be manufactured by NanJing TongHuaXin Electronic Co, Ltd., a Chinese switching power supply chip company (details). The datasheet for this part is very hard to understand, as it is machine-translated from Chinese, for example:

The startup circuit inside IC is designed as a particular current inhalation way, so it can start up with the magnification function of the power switch tube itself.

After some more investigation, this chip seems to be the SDC603 Current Mode PWM Controller designed by SDC Semi (Shaoxing Devechip Microelectronics Co., Ltd.). This is a Chinese state-level R&D center that is part of China's Torch Plan Project to develop high-tech industries. (Also check out the SDC company song.)

The controller chip is a basic 8-pin current-mode PWM controller chip. It includes a built-in NPN power transistor, which reduces the charger part count. The chip can produce 12 watts output power.

Circuit board

The above picture shows the KMS charger circuit board on the left and a circuit board from the HP TouchPad charger on the right. Compact phone chargers such as the iPhone or TouchPad chargers go to amazing effort to pack the components as tightly as possible. The KMS charger on the other hand has a much more spacious design with a lot of wasted space. Since any charger with 4 USB ports is going to be fairly large, they probably figured it's not worth the effort to make the rest of the circuitry compact. The difference in density between the two circuit boards is striking, though.

A key safety feature of the KMS charger is visible in the middle of the circuit board - note the angular cut-out slot, and the empty vertical region with no circuitry. This isolates the high-voltage circuits on the right from the low-voltage output circuits on the left. The KMS charger has a safe 6mm gap and the cut-out provides additional creepage distance. Counterfeit chargers usually skip this critical safety feature, with only a millimeter or two keeping the high voltage from reaching the output and shocking the user.

You might wonder how the charger works if the high voltage and low voltage circuits are separated by a gap. The key is that any components that cross this gap must be specially designed to avoid electrical hazards. The key component is the flyback transformer, which transfers the power through magnetic fields, avoiding any direct electrical connection between the two sides. The feedback signal passes from the secondary to the primary through an optocoupler, which transmits the feedback through a light signal, again avoiding an electrical connection. Finally, a Y safety capacitor connects the primary and secondary grounds to reduce electrical noise. The design of a Y capacitor ensures it won't pass dangerous electrical currents, and won't short out even under fault conditions.

Transformer teardown

The flyback transformer is the key component of a charger and usually the largest and most expensive. The transformer is where the high input voltage is converted to the output voltage, and the two voltages are in extremely close proximity, so the safety of the transformer is critical. From the outside, you can't tell if the manufacturer saved a few cents by leaving out most of the insulation, as happens with $2 chargers. I tore apart the transformer of the KMS charger to see what's inside.

The black circle on top of the transformer seen earlier is simply a foam disk, which helps reduce transformer noise by padding the transformer against the case. If a charger makes a high-pitched noise, it's usually coming from the transformer. Power supplies are usually designed with switching frequencies higher than people can hear, but in some circumstances it's still audible, especially if you are young and haven't lost high frequency hearing.

Under the first layers of insulating tape is a copper 'belly band' which surrounds the transformer to provide noise shielding from eddy currents in the transformer.[2] This copper shielding is omitted from super-cheap transformers, showing that this charger goes beyond the minimum.

The windings are all separated by insulating tape. Under the belly band and insulating tape is the auxiliary winding, which provides power to the control IC. You might wonder why the IC needs a separate power supply instead of using the USB power output, but this wouldn't be safe because the USB output would no longer be isolated from the input. This winding is 9 turns of wire; since the IC requires low current, the wire is fairly thin.

Above you can see half of the primary winding, which is fed by the input power. This winding has 40 turns of wire.

An interesting safety feature is the 3 mm "margin tape"[3] to the lower right of the winding, which ensures that the primary winding stays 3 mm away from the edge. I was interested to see this, since other transformers I've disassembled use triple-insulated wire instead of boundary tape. To ensure safe electrical isolation between the primary and secondary windings, either the secondary wires need to be triple-insulated, or there needs to be at least 6mm of distance between the windings. Super-small chargers don't have 3mm of extra room, so they use the more expensive triple-insulated wire. But since the KMS is larger, it uses the 3mm margin tape. I'm not an expert on safety requirements, but it looks like this transformer doesn't quite meet the requirements. Normally, the margin tape is put on both sides, so there's a total of 6mm creepage distance between the windings.[4][5] But since the tape is only on one side, the windings only have half of the required distance.

The secondary winding provides the low-voltage high-current output with 8 turns of wire. In order to support 2 amps, this winding has thick wire with four strands in parallel. I haven't seen parallel strands like this before, probably because the KMS charger supplies higher power. Note the 3mm margin tape keeping the winding away from the edge.

Finally, the second half of the primary winding forms the innermost layer of the transformer; this is also 40 turns of wire. The primary winding is split into two layers that surround the secondary winding for better electrical properties. Note that the primary winding is 80 turns, while the secondary output winding is 8 turns. To oversimplify a bit, this means the output will be 10 times the current of the input at 1/10 the voltage, which is how the high voltage low current input results in the low voltage high current output. The above picture gives a good view of the 3mm margin tape at the right that keeps the wire away from the edge of the core.
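The turns-ratio arithmetic can be written out as a short sketch. It uses the ideal-transformer approximation, which ignores losses and the fact that a flyback transformer stores energy rather than passing it straight through, so it shares the oversimplification noted above. The function name and the sample primary-side values are illustrative assumptions, not measurements.

```python
def ideal_transformer(v_primary, i_primary, n_primary, n_secondary):
    """Return (voltage, current) on the secondary of an ideal transformer."""
    ratio = n_primary / n_secondary
    return v_primary / ratio, i_primary * ratio


# Two 40-turn primary halves and an 8-turn secondary, as counted above.
N_PRIMARY = 40 + 40
N_SECONDARY = 8

# Hypothetical primary-side values, chosen only to show the scaling.
v_out, i_out = ideal_transformer(v_primary=50.0, i_primary=0.2,
                                 n_primary=N_PRIMARY, n_secondary=N_SECONDARY)
print(N_PRIMARY / N_SECONDARY)  # 10.0
print(v_out, i_out)             # 5.0 2.0
```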

Measuring the charger in use

The charger is a switching power supply using a flyback transformer. How this works is the high voltage DC is switched on and off tens of thousands of times a second by the control IC. These pulses of DC are sent into the flyback transformer. A flyback transformer is different from normal transformers in that the output diode blocks power from flowing out of the transformer while power is flowing in. Instead, as the current increases, power is stored in the transformer as a magnetic field. When the input current switches off, the stored power then flows out of the transformer, providing the desired output.

By looking at the output voltage and frequency spectrum, we can determine a fair bit about how the device operates. I measured a constant 60 kHz switching frequency above 1 amp output load, but a dropping frequency for lower loads. The datasheet gives some clues to this behavior. The power supply normally operates using PWM (pulse width modulation). The switching frequency is constant, but the amount of time the power transistor is on varies. The longer it is on, the more power into the transformer and the more output power. This matches the observed behavior from 1 amp to 3.5 amps. The datasheet also describes how the switching frequency drops under low power, which matches what I observed below 1 amp.

The above oscilloscope trace illustrates the behavior when producing 2 amps. The frequency spectrum shows narrow peaks (orange) at the 60 kHz switching frequency and harmonics. The yellow output voltage shows a bunch of large spikes due to the power switching on and off - this indicates that the charger isn't filtering the output very well, letting these spikes get into the connected device.

The diagram below zooms in to show the output in more detail. Each spike is when the switching transistor turns on at 60 kHz. The output power drops as the current through the flyback transformer increases (since the transformer secondary is blocked by the diode at this time). The output then climbs when the transistor switches off and the power is transferred to the secondary.

As the charger load increases above 3 amps, the quality of the output significantly decreases, and large 120 Hz ripple appears in the output (yellow). This is probably because the input capacitors can't store enough power to provide a constant output at this high load. Since the charger is only rated to provide 2.1 amps of output, I don't consider this a design flaw, but it's interesting to see this behavior in the output. The key result here is not to overload the charger, because the power quality gets much worse.

The charger is designed to reduce the switching frequency under low load for efficiency. I found this feature kicks in at loads under 1 amp, with the switching frequency smoothly dropping from 60 kHz to 29 kHz at 250 mA load and even lower under no load. The graph below shows the frequency spectrum at 250 mA load. Note that the spikes are wider than the previous case since the frequency becomes more unstable when it is reduced.

The output waveform below at 250 mA is similar to the previous (2A) case, except at a lower frequency. Note that the output still has large spikes when the transistor switches on. The output voltage drops while the switching transistor is on and then rises while the transistor is off (due to the flyback design), so you can see below that the transistor is off most of the time at low power.

Power consumption

Measuring the power consumption of a charger is tricky because the charger doesn't use power like a normal resistive load, but instead draws its input current in a nonlinear way. This results in a power factor lower than unity. (You might expect that the poor power factor is because the charger switches on and off thousands of times a second, but actually it's the fault of the diode bridge.) I measured the power consumption of the charger under load by measuring the instantaneous line voltage and current, computing the instantaneous power, and then computing the real power from this.[6] In the following diagrams, the input line voltage is shown in yellow, and the input current is in cyan. The instantaneous power is graphed in orange at the bottom - simply the product of the voltage and current.[7]
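The real-power calculation can be illustrated with synthetic data. The sketch below builds a sine-wave line voltage and a current that flows only near the voltage peaks, roughly the way a diode bridge recharges its filter capacitor, and then averages the instantaneous power. The waveform shape, the 120 V line, the current amplitude, and the conduction threshold are all assumptions for illustration; they are not the measured data.

```python
import math


def power_summary(voltage, current):
    """Return (real power, apparent power, power factor) from samples
    covering a whole number of line cycles."""
    n = len(voltage)
    real = sum(v * i for v, i in zip(voltage, current)) / n   # mean of v*i
    v_rms = math.sqrt(sum(v * v for v in voltage) / n)
    i_rms = math.sqrt(sum(i * i for i in current) / n)
    apparent = v_rms * i_rms
    return real, apparent, real / apparent


SAMPLES = 2000                      # samples across one line cycle
V_PEAK = 120 * math.sqrt(2)         # assumed 120 V RMS line
voltage = [V_PEAK * math.sin(2 * math.pi * k / SAMPLES) for k in range(SAMPLES)]

# Assumed current shape: conducts only while |v| exceeds 90% of the peak.
current = [0.1 * math.copysign(1.0, v) if abs(v) > 0.9 * V_PEAK else 0.0
           for v in voltage]

real, apparent, pf = power_summary(voltage, current)
print(f"real {real:.2f} W, apparent {apparent:.2f} VA, power factor {pf:.2f}")

# A resistive load draws current proportional to voltage, so its factor is 1.
_, _, pf_resistive = power_summary(voltage, [v / 100 for v in voltage])
print(f"resistive power factor {pf_resistive:.2f}")
```

The pulsed current gives a power factor well below 1 even though it is perfectly in phase with the voltage, which is the point made about the diode bridge.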

The oscilloscope output below shows the power usage of the charger under no load. The line input voltage (yellow) is a nice sine wave, but the current (cyan) is very irregular. There is a bump corresponding to the voltage peaks as the input diodes conduct and re-charge the filter capacitors. The remaining current oscillations are unusual - I haven't seen them in other chargers, and I expect they are due to the large input choke. From the orange line you can see that the power usage has small spikes at 120 Hz. Taking the power factor into account and computing real power shows the charger uses 180 mW when idle which is fairly high, but actually lower than the Apple iPhone charger.

With load applied to the charger, the power usage shoots up as shown below. I compute the power usage as 6.4 watts, while the charger is supplying 4.4 watts to the output, for an efficiency of 69%. The shape of the current curve (cyan) and power curve (orange) shows that the charger is taking line power about half the time (the big curved peaks), and not for the other half (the flat oscillations in between). This illustrates the bad power factor that switching power supplies have. (PC power supplies often use power factor correction (PFC) circuits to improve the power factor.) The yellow input voltage curve is somewhat distorted, probably due to the lame isolation transformer I used.

You might wonder what happens if you short-circuit the output of the charger. It is designed to shut down before damage occurs, rather than self-destruct. After the internal voltage drops, the charger will start up again, and repeat this cycle until the problem goes away. This is called "hiccup mode", since the charger generates hiccups of power. The oscilloscope trace below shows the power consumption of the KMS charger when shorted. Note the pulses as it starts up and shuts down every 250 milliseconds.

Components

For those who are interested in the components, I have some details. The two 6.8uF 400V electrolytic capacitors in the primary are made by ChengX. The two 470uF capacitors in the secondary are made by JWCO. The X capacitor is a .1uF K 275V X2 made by Dain Electronics, a Chinese manufacturer of plastic metal film capacitors, now merged with WINDAY Electronic Industrial Co Ltd. The Y1 capacitor is a JN222M 2200pF disk ceramic suppression capacitor manufactured by Jya-Nay, a Taiwanese capacitor company. There's also a blue 681J (i.e. .68nF) polyester film capacitor of unknown manufacturer; looking at the circuit board this capacitor (C7) was originally a surface-mounted device, but was replaced with a larger capacitor.
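The three-digit markings mentioned above follow the usual capacitor code: two significant digits, a power-of-ten multiplier, and an optional tolerance letter, with the value in picofarads. A small sketch of the decoding is shown below. The function name is made up for this example, and only a few common tolerance letters are included.

```python
TOLERANCE = {"J": 5, "K": 10, "M": 20}   # percent, common letters only


def decode_cap_code(code):
    """Decode a marking such as '681J' into (picofarads, tolerance percent)."""
    digits, letter = code[:3], code[3:].upper()
    picofarads = int(digits[:2]) * 10 ** int(digits[2])
    return picofarads, TOLERANCE.get(letter)


print(decode_cap_code("681J"))   # (680, 5): 680 pF = 0.68 nF
print(decode_cap_code("222M"))   # (2200, 20): 2200 pF, as on the Y capacitor
```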

The diodes are manufactured by MIC (Master Instrument Corporation, Shanghai). Most chargers use a diode bridge to convert the AC to DC, but this charger uses four independent diodes, which are 1N4007 700V diodes. The secondary rectification uses two Schottky diodes (SR360 3 amp 60V) from MIC. The circuit board uses the unusual mounting of two diodes on top of each other soldered into the same holes. The charger also uses FR107 700V fast recovery diodes.

Like most power supplies, the charger uses a TL431A for the voltage feedback.[1] This TL431A is produced by Wing Shing Computer Components. The optocoupler is an ORPC 817B optocoupler from Shenzhen Orient Technology Co., Ltd. (I don't want to speculate on the cultural significance of their raising the flag over Iwo Jima company logo.)

Conclusion

The KMS charger occupies an interesting middle ground between dangerous $2 counterfeit chargers and expensive name-brand chargers. Tearing down this 4-port USB charger of unknown origin reveals details of the circuitry. It also illustrates a network of Chinese suppliers and manufacturers, most of which are hardly known in the US. On Amazon, customer ratings for this charger are split between people who love it and people who hate it, which seems reasonable given what I saw in the teardown. Thanks to Gary F. for providing the charger.

Notes and references

[1] To summarize the feedback circuit: R17 and R18 form a resistor divider on the output voltage. If the output voltage is above 5.125 volts, the TL431 control input will be above 2.5 volts and the TL431 conducts. This energizes the optocoupler, providing current pulling the FB pin lower. Low FB increases the duty cycle, increasing the maximum transformer current, and increasing the output voltage. If the output voltage is considerably too high, or overtemperature is sensed, the switching frequency is decreased, reducing the power transferred to the output. (This is over-simplified; the frequency response of the feedback control loop is controlled via R13, R16, C8, and C9.) An alternative is to sense voltage from the primary side, so the feedback circuit can be eliminated. This reduces the total charger cost by about 20 cents according to a report.
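The 5.125 volt threshold follows from the divider ratio and the 2.5 volt reference. The sketch below shows the arithmetic. The actual R17 and R18 values are not given here, so the resistor values in the example are hypothetical, as is the assignment of which resistor sits on the output side (r_top) and which on the ground side (r_bottom).

```python
V_REF = 2.5   # TL431 reference voltage in volts, as stated in the note


def regulation_threshold(r_top, r_bottom, v_ref=V_REF):
    """Output voltage at which the divider tap reaches the reference."""
    return v_ref * (1 + r_top / r_bottom)


def required_ratio(v_out, v_ref=V_REF):
    """Ratio r_top / r_bottom needed to put the threshold at v_out."""
    return v_out / v_ref - 1


print(required_ratio(5.125))                  # about 1.05
# Hypothetical resistor values with that ratio (not read from the board):
print(regulation_threshold(10_500, 10_000))   # about 5.125
```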

[2] The use of a copper "belly band" in flyback transformers is discussed in Flyback Transformer Design for the UCC28600 (page 2). It provides an electromagnetic radiation shield. The article mentions that the belly band may cause difficulties with creepage requirements and that seems to be the case with the KMS, since there is only 3mm creepage between the primary-grounded belly band and the secondary wiring.

[3] A lot of interesting information about flyback transformer design and construction is in Cookbook for do-it-yourself transformer design.

[4] A discussion of how to achieve 5-6mm creepage distance by using 2.5 or 3mm margin tape is in Flyback Transformer Design for the IRIS40xx Series. Note that the margin tape must be on both sides of the winding to achieve this distance, while the KMS transformer only uses the tape on one side.

[5] Safety Considerations in Power Supply Design provides a detailed explanation of safety requirements for power supplies. It explains creepage and clearance.

[6] See Understanding power factor and input current harmonics in switched mode power supplies for details on power factor, why power supplies have poor power factors, and why poor power factors are a bad thing. Briefly, the poor power factor is due to the non-linear current through the diodes at peaks, not due to a phase shift. Real power can be measured with an oscilloscope as the average value of the instantaneous power; see Power - Real And Apparent: A Tutorial On Basic Line Power Measurements or Measuring power using the DL750.

[7] For the input power measurements it is very important to use an isolation transformer to avoid destroying your oscilloscope or shocking yourself. For my measurements, a resistor voltage divider reduced the input line voltage - the actual voltage is 11.06 times the displayed probe 1 voltage (C1, yellow). The current was measured through a 5.2 ohm shunt resistor, so the current is 1/5.2 times the displayed probe 2 voltage (C2, cyan). Combining these, the power in watts is 2.13 times the measured C1*C2 value (M1, orange).
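The scale factors in this note combine as follows. This is a minimal sketch of the arithmetic only; the function name and the sample probe readings are assumptions for illustration.

```python
VOLTAGE_DIVIDER = 11.06   # line voltage = 11.06 x probe 1 reading (C1)
SHUNT_OHMS = 5.2          # line current = probe 2 reading (C2) / 5.2 ohms


def line_power(c1_volts, c2_volts):
    """Instantaneous input power in watts from the two probe readings."""
    line_voltage = c1_volts * VOLTAGE_DIVIDER
    line_current = c2_volts / SHUNT_OHMS
    return line_voltage * line_current


scale = VOLTAGE_DIVIDER / SHUNT_OHMS
print(round(scale, 2))   # 2.13

# Hypothetical readings, only to show the conversion:
print(round(line_power(10.0, 0.1), 2))   # 2.13
```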

↧

JavaScript on the go: Programming from your phone

November 24, 2012, 11:37 am

Have you ever wanted to write a program when the only computer available is your phone? You can use an Android phone to write and run JavaScript programs by using a few simple tricks.

While traveling over Thanksgiving I was thinking about how the 6502 microprocessor works and wanted to analyze some Boolean logic circuits. This was a trivial programming task, but the only computer I had was my phone.

I searched for programming languages available on Android. Python for Android looked way too complex. The Clojure REPL intrigued me but I didn't want to learn Clojure right now. Other languages seemed limited or buggy. Then I was struck by the obvious choice for a powerful and fully-supported language with graphics capabilities: JavaScript. I could run JavaScript programs in the browser if I had a way to enter them.

I downloaded DroidEdit Pro which gave me a fullscreen editor for files on my phone. Typing HTML on the phone was painful until I downloaded the Hacker's Keyboard, which makes it much easier to type special characters. The picture below shows these tools in use.

My development cycle is:

Edit the code in DroidEdit and save it to a local .html file.
Select 'Preview in Browser' from DroidEdit and test the program.
Upload the file to my web server using DroidEdit's SFTP support when ready.

For debugging, the trick is to use the default browser, not Chrome. Enter about:debug in the URL bar to open the JavaScript console, which is vital for debugging.

Obviously this environment isn't as powerful as a full-size keyboard, a monitor, and a powerful editor, but it lets me program no matter where I am. I haven't got the hang of cut-and-paste in the editor, but shift-arrow seems to work better than tapping.

Here's my program in action. It won't get any style points - I rapidly lost my enthusiasm for whitespace with the tiny keyboard - but it got the job done.

I also used this development environment to show my nephew how to make web pages with HTML. He thought it was very cool that he could type HTML into the phone, hit Control-S to save, and immediately load the web page on his iPad. He's now busily learning HTML and building his own web pages.

I hope these tips help you program while on the road. Leave a comment if you have tips of your own.

↧

The 6502 overflow flag explained mathematically

December 21, 2012, 11:12 pm

The overflow flag on the 6502 processor is a source of myth and confusion. In this article, I explain signed and unsigned binary arithmetic, discuss the meaning of the overflow flag, show various formulas for computing overflow, and dispel some myths about the overflow flag.

The 6502 is an 8-bit microprocessor that was very popular in the 1970s and 1980s, powering popular home computers such as the Apple II, Commodore PET, and Atari 400/800. The 6502 instruction set includes 8-bit addition and subtraction operations. Various status flags (carry, zero, negative, overflow) are set based on the result of the operation. Most of the flags (carry, zero, negative) are straightforward, but the meaning of the overflow (V) flag is harder to understand. If the result of a signed add or subtract won't fit into 8 bits, the overflow flag is set. (The overflow flag is affected in a couple of other cases - the BIT operation, and the SO pin on the chip. These are discussed in detail in the excellent article The overflow flag explained, so I won't discuss them here.)

Addition on the 6502

The 6502 has an 8-bit addition operation ADC (add with carry) which adds two numbers and a carry-in bit, yielding an 8-bit result and a carry out bit. The following diagram shows an example addition in binary, decimal, and hexadecimal.

Unsigned binary addition of 80 + 144 yielding 224.

The carry flag is used as the carry-in for the operation, and the resulting carry-out value is stored in the carry flag. The carry flag can be used to chain together multiple ADC operations to perform multi-byte addition.
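
As a rough illustration of chaining the carry, here is a minimal Python sketch of a 16-bit addition built from two 8-bit adds. The helper names adc and add16 are hypothetical and only model the arithmetic described above, not the 6502's actual instruction encoding.

```python
def adc(a, b, carry_in):
    """Model of an 8-bit add with carry: returns (8-bit result, carry out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8

def add16(x, y):
    """Add two 16-bit values one byte at a time, low byte first."""
    lo, carry = adc(x & 0xFF, y & 0xFF, 0)      # carry-in starts cleared
    hi, carry = adc(x >> 8, y >> 8, carry)      # carry from the low byte feeds the high byte
    return (hi << 8) | lo, carry
```

For instance, add16(0x12F0, 0x0120) produces a carry out of the low byte that is absorbed by the high-byte addition.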

Ones-complement and twos-complement

The concepts of ones-complement and twos-complement are important to understand signed arithmetic. The ones complement of a number simply flips all 8 bits in the number. That is, the ones complement of N is 255-N. This is very easy to do in hardware.

The twos complement of a number is the ones complement of the number plus 1. That is, the twos complement of N is 256-N. The twos complement is very useful because adding M and the twos complement of N is the same as subtracting N from M. For example, to compute 80 - 112, simply take the twos complement of 112 (binary 10010000) and add it to 80 (binary 01010000), yielding binary 11100000. This result is the twos complement of 32, indicating -32.

Signed binary addition of 80 and -112 yielding -32.

Note that 80+144 and 80-112 had exactly the same bit-level operations - only the interpretation of the bits was different. This is why twos complement numbers are so useful - the same addition circuit works with them.

To see why twos complement numbers work this way, consider M + (-N) or M - N:

M - N
→ M - N + 256	Adding 256 doesn't change the 8-bit value.
= M + (256 - N)	Simple algebra.
= M + twos complement of N	Definition of twos complement.

Thus, adding the twos complement is the same as subtracting. (With the exception of the carry bit, which is affected by the extra 256. This will be discussed later.)
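
The identity above can be checked with a short Python sketch. The function names are hypothetical; the arithmetic follows the 255-N and 256-N definitions given earlier.

```python
def ones_complement(n):
    """Flip all 8 bits, which is the same as 255 - n."""
    return n ^ 0xFF

def twos_complement(n):
    """Ones complement plus 1, kept to 8 bits (256 - n, modulo 256)."""
    return (ones_complement(n) + 1) & 0xFF

def subtract_via_add(m, n):
    """Compute m - n as an 8-bit value by adding the twos complement of n."""
    return (m + twos_complement(n)) & 0xFF
```

Here subtract_via_add(80, 112) gives binary 11100000, the same bit pattern as the unsigned sum 80 + 144.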

Twos-complement signed numbers

Twos complement numbers are very useful for representing signed numbers, since a number between -128 and +127 can fit into one byte: the top bit is 0 for a normal non-negative number (0 to 127), and the top bit is 1 for a twos-complement negative number (-1 to -128). (The value of the top bit is reflected in the N (negative) status flag.)

The nice thing about signed numbers is that regular binary arithmetic yields the expected results (in most cases). That is, the processor adds or subtracts the numbers as if they are unsigned binary numbers, and the right answer occurs just by interpreting them as signed.

Another example shows that the carry is ignored with signed addition. In this case, 80 and -48 are added, yielding 32. Since 80 + (256-48) = 256 + (80-48), the "extra" 256 ends up in the carry bit.

Signed addition of 80 and -48 yields a carry, which is discarded.

Unfortunately, problems can happen. For instance, 80 + 80 = 160 with unsigned arithmetic, but with signed arithmetic the result is unexpectedly -96. The problem is that 160 will fit into a byte as an unsigned number, but it is too big to store in a byte as a signed number. Since the top bit is set, it is interpreted as a negative number. To indicate this problem, the 6502 sets the overflow flag.

Signed addition of 80 + 80 yields overflow.

The table that explains everything about overflow

The definition of the 6502 overflow flag is that it is set if the result of a signed addition or subtraction doesn't fit into a signed byte. That is, overflow occurs if the result is > 127 or < -128. The symptom of this is adding two positive numbers and getting a negative result or adding two negative numbers and getting a positive result.
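
This definition translates directly into a small Python sketch. The names to_signed and signed_add_overflows are hypothetical; the code simply interprets each byte as signed and checks whether the true sum falls outside -128 to 127.

```python
def to_signed(byte):
    """Interpret an 8-bit value as a twos-complement signed number."""
    return byte - 256 if byte & 0x80 else byte

def signed_add_overflows(m, n, carry_in=0):
    """True if the signed sum of two bytes (plus carry-in) doesn't fit in a signed byte."""
    true_sum = to_signed(m) + to_signed(n) + carry_in
    return true_sum > 127 or true_sum < -128
```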

This section explores all the possible ways that overflow can occur. The following examples consider the addition of two signed numbers M and N. It is only necessary to consider the top bits of the numbers and the carry from bit 6, as shown in the diagram below, since the lower bits don't affect overflow (except by causing a carry from bit 6).

Binary addition, demonstrating the bits that affect the 6502 overflow flag.

There are 8 possibilities for these bits, as expressed in the table below. For each set of input bits, the table shows the carry out (C₇), the top bit of the sum (S₇), which is the sign bit, and the overflow bit V. This covers the 4 possibilities for sign of the arguments (positive + positive, positive + negative, negative + positive, negative + negative), with and without carry from bit 6. The table shows an example sum for each line, first expressed in hexadecimal, and then interpreted as unsigned addition and signed addition.

Inputs			Outputs				Example
M₇	N₇	C₆	C₇	S₇	V	Carry / Overflow	Hex	Unsigned	Signed
0	0	0	0	0	0	No unsigned carry or signed overflow	0x50+0x10=0x60	80+16=96	80+16=96
0	0	1	0	1	1	No unsigned carry but signed overflow	0x50+0x50=0xa0	80+80=160	80+80=-96
0	1	0	0	1	0	No unsigned carry or signed overflow	0x50+0x90=0xe0	80+144=224	80+-112=-32
0	1	1	1	0	0	Unsigned carry, but no signed overflow	0x50+0xd0=0x120	80+208=288	80+-48=32
1	0	0	0	1	0	No unsigned carry or signed overflow	0xd0+0x10=0xe0	208+16=224	-48+16=-32
1	0	1	1	0	0	Unsigned carry but no signed overflow	0xd0+0x50=0x120	208+80=288	-48+80=32
1	1	0	1	0	1	Unsigned carry and signed overflow	0xd0+0x90=0x160	208+144=352	-48+-112=96
1	1	1	1	1	0	Unsigned carry, but no signed overflow	0xd0+0xd0=0x1a0	208+208=416	-48+-48=-96

A few interesting things can be noted from this table. Signed overflow (V=1) happens in two of the eight cases - when the result of adding two positive numbers overflows and ends up negative, and when the result of adding two negative numbers overflows and ends up positive. These rows are highlighted. Signed overflow will never happen when adding a positive number and a negative number, since the result will have a smaller magnitude. Unsigned carry (red in the unsigned column) happens in four of the eight cases, and is independent of signed overflow.

Formulas for the overflow flag

There are several different formulas that can be used to compute the overflow bit. By checking the eight cases in the above table, these formulas can easily be verified.

A common definition of overflow is V = C₆ xor C₇. That is, overflow happens if the carry into bit 7 is different from the carry out.

A second formula simply expresses the two lines that cause overflow: if the sign bits (M₇ and N₇) are 0 and the carry in is 1, or the sign bits are 1 and the carry in is 0:
V = (!M₇&!N₇&C₆) | (M₇&N₇&!C₆)

The above formula can be manipulated with De Morgan's laws to yield the formula that is actually implemented in the 6502 hardware:
V = not (((M₇ nor N₇) and C₆) nor ((M₇ nand N₇) nor C₆))

Overflow can be computed simply in C++ from the inputs and the result. Overflow occurs if (M^result)&(N^result)&0x80 is nonzero. That is, if the sign of both inputs is different from the sign of the result. (Anding with 0x80 extracts just the sign bit from the result.) Another C++ formula is !((M^N) & 0x80) && ((M^result) & 0x80). This means there is overflow if the inputs do not have different signs and the input sign is different from the output sign (link).
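
Rather than checking only the eight table rows, the formulas can be compared against the definition for every possible input. The following Python sketch does that; the function names are hypothetical, and the C++ expressions from the text are transcribed into Python operators.

```python
def to_signed(byte):
    """Interpret an 8-bit value as a twos-complement signed number."""
    return byte - 256 if byte & 0x80 else byte

def overflow_formulas(m, n, carry_in):
    """Evaluate four of the overflow formulas from the text for one addition."""
    total = m + n + carry_in
    result = total & 0xFF
    c7 = total >> 8                                   # carry out of bit 7
    c6 = ((m & 0x7F) + (n & 0x7F) + carry_in) >> 7    # carry out of bit 6 into bit 7
    m7, n7 = m >> 7, n >> 7
    return (
        bool(c6 ^ c7),
        bool(((1 - m7) & (1 - n7) & c6) | (m7 & n7 & (1 - c6))),
        bool((m ^ result) & (n ^ result) & 0x80),
        (not ((m ^ n) & 0x80)) and bool((m ^ result) & 0x80),
    )

def all_formulas_agree():
    """Compare every formula with the signed-range definition over all inputs."""
    for m in range(256):
        for n in range(256):
            for c in (0, 1):
                true_sum = to_signed(m) + to_signed(n) + c
                expected = not -128 <= true_sum <= 127
                if any(v != expected for v in overflow_formulas(m, n, c)):
                    return False
    return True
```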

Subtraction on the 6502

The behavior of the overflow flag is fundamentally the same for subtraction, indicating that the result doesn't fit into the signed byte range -128 to 127. The 6502 has an SBC operation (subtract with carry) that subtracts two numbers and also subtracts the borrow bit. If the (unsigned) operation results in a borrow (is negative), then the borrow bit is set. However, there is no explicit borrow flag - instead the complement of the carry flag is used. If the carry flag is 1, then borrow is 0, and if the carry flag is 0, then borrow is 1. This behavior may seem backwards, but note that both for addition and subtraction, if the carry flag is set, the output is one more than if the carry flag is clear.

Defining the borrow bit in this way makes the hardware implementation simple. SBC simply takes the ones complement of the second value and then performs an ADC. To see how this works, consider M minus N minus borrow B.

M - N - B	SBC of M and N with borrow B
→ M - N - B + 256	Add 256, which doesn't change the 8-bit value.
= M - N - (1-C) + 256	Replace B with the inverted carry flag.
= M + (255-N) + C	Simple algebra.
= M + (ones complement of N) + C	255 - N is the same as flipping the bits.
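
The derivation can be sketched in Python as follows. The helper names adc and sbc are hypothetical; sbc does nothing more than ones-complement its second argument and call adc, as described above.

```python
def adc(a, b, carry_in):
    """Model of an 8-bit add with carry: returns (8-bit result, carry out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8

def sbc(m, n, carry):
    """Subtract with carry, implemented as ADC of the ones complement of n.

    The carry flag is the inverted borrow: carry=1 means no borrow.
    """
    return adc(m, n ^ 0xFF, carry)
```

With the carry set (no borrow), sbc(0x50, 0x30, 1) yields 0x20 with carry out 1, while sbc(0x50, 0xF0, 1) yields 0x60 with carry out 0, signalling a borrow.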

The following table shows the overflow cases for subtraction. It is similar to the previous table, with the addition of the B column that indicates if a borrow resulted. Unsigned operations resulting in borrow are shown in red, as are signed operations that result in an overflow.

Inputs			Outputs					Example
M₇	N₇	C₆	C₇	B	S₇	V	Borrow / Overflow	Hex	Unsigned	Signed
0	1	0	0	1	0	0	Unsigned borrow but no signed overflow	0x50-0xf0=0x60	80-240=96	80--16=96
0	1	1	0	1	1	1	Unsigned borrow and signed overflow	0x50-0xb0=0xa0	80-176=160	80--80=-96
0	0	0	0	1	1	0	Unsigned borrow but no signed overflow	0x50-0x70=0xe0	80-112=224	80-112=-32
0	0	1	1	0	0	0	No unsigned borrow or signed overflow	0x50-0x30=0x120	80-48=32	80-48=32
1	1	0	0	1	1	0	Unsigned borrow but no signed overflow	0xd0-0xf0=0xe0	208-240=224	-48--16=-32
1	1	1	1	0	0	0	No unsigned borrow or signed overflow	0xd0-0xb0=0x120	208-176=32	-48--80=32
1	0	0	1	0	0	1	No unsigned borrow but signed overflow	0xd0-0x70=0x160	208-112=96	-48-112=96
1	0	1	1	0	1	0	No unsigned borrow or signed overflow	0xd0-0x30=0x1a0	208-48=160	-48-48=-96

Comparing the above table with the overflow table for addition shows the tables are structurally similar if you take the ones-complement of N into account. As with addition, two of the rows result in overflow. However, some things are reversed compared with addition. Overflow can only occur when subtracting a positive number from a negative number or vice versa. Subtracting positive from positive or negative from negative is guaranteed not to overflow.

The formulas for overflow during addition given earlier all work for subtraction, as long as the second argument (N) is ones-complemented. Since internally subtraction is just addition of the ones-complement, N can simply be replaced by 255-N in the formulas.
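
As a sketch of this substitution, the following Python code applies the sign-comparison formula to the ones complement of N and checks it against the signed-range definition. The function names are hypothetical.

```python
def to_signed(byte):
    """Interpret an 8-bit value as a twos-complement signed number."""
    return byte - 256 if byte & 0x80 else byte

def sbc_overflow(m, n, carry=1):
    """Overflow for m - n - borrow, using the addition formula with n replaced by 255 - n."""
    n_inv = 255 - n
    result = (m + n_inv + carry) & 0xFF
    return bool((m ^ result) & (n_inv ^ result) & 0x80)

def sbc_overflow_by_definition(m, n, carry=1):
    """Overflow computed directly: does the signed difference fit in -128..127?"""
    diff = to_signed(m) - to_signed(n) - (1 - carry)
    return not -128 <= diff <= 127
```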

Overflow myths

There are a lot of myths and confusion about the overflow flag. Since the flag is a bit difficult to understand, simple but wrong explanations are easy to find.

The most common myth is that just as the carry bit indicates a carry (or overflow) from bit 7, the overflow bit indicates a carry (or overflow) from bit 6 (example, example, example). As can be seen from the table above, sometimes a carry from bit 6 causes an overflow and sometimes it doesn't.

Another myth is that for multi-byte signed numbers, you use the overflow flag instead of the carry flag to carry from one byte to another (example). In fact, carry is still used to add/subtract multi-byte signed numbers, the same as with unsigned numbers.
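
To make this concrete, here is a minimal Python sketch of a 16-bit signed addition. The names adc and add16_signed are hypothetical. The carry links the two bytes exactly as in the unsigned case, and only the overflow result of the final (high-byte) addition is meaningful.

```python
def adc(a, b, carry_in):
    """Model of an 8-bit add: returns (result, carry out, signed overflow)."""
    total = a + b + carry_in
    result = total & 0xFF
    overflow = bool((a ^ result) & (b ^ result) & 0x80)
    return result, total >> 8, overflow

def add16_signed(x, y):
    """Add two 16-bit twos-complement values; returns (16-bit result, overflow)."""
    lo, carry, _ = adc(x & 0xFF, y & 0xFF, 0)          # low byte: overflow ignored
    hi, carry, overflow = adc(x >> 8, y >> 8, carry)   # high byte: carry in, overflow out
    return (hi << 8) | lo, overflow
```

Adding 0xFFFF (-1) and 0x0002 carries out of the low byte and gives 0x0001 with no overflow, while 0x7FFF + 0x0001 gives 0x8000 with overflow set.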

It is sometimes claimed that the overflow bit is set if a result is too large to be represented in a byte (example, example). This omits the critical word signed - a signed result can be too large to fit in a byte, even if the unsigned result fits, and vice versa. Examples are in the table above.

Another confusing explanation is that the overflow flag is set when the sign bit is affected (example). The table shows that sometimes there is overflow when the sign bit is affected by bit 6 carry, and sometimes there is overflow when the sign bit is not affected.

Conclusions

This is probably more than anyone really wants to know about the overflow flag. In my next article, I will discuss how this is implemented at the silicon level.

↧

The 6502 CPU's overflow flag explained at the silicon level

January 12, 2013, 12:10 am

In this article, I show how overflow is computed in the 6502 microprocessor at the transistor and silicon level. I've discussed the mathematics of the 6502 overflow flag earlier and thought it would be interesting to look at the actual chip-level implementation. Even though the overflow flag is a slightly obscure feature, its circuit is simple enough that it can be explained at the silicon level.

The 6502 microprocessor chip

The 6502 is an 8-bit microprocessor that was very popular in the 1970s and 1980s, powering popular home computers such as the Apple II, Commodore PET, and Atari 400/800. The following photograph shows the die of a 6502 processor. Looking at the photograph, it seems impossibly complex, but it turns out that it actually can be understood, using the Visual 6502 group's reverse engineered 6502. The red box shows the part of the chip that will be explained in this article. The 6502 chip is made up of 4528 transistors (3510 enhancement transistors and 1018 depletion pullup transistors). (By comparison, a modern Xeon processor has over 2.5 billion transistors, which would be almost hopeless to try to understand.)

Photomicrograph of the 6502, from Visual 6502 (CC BY-NC-SA 3.0). The following diagrams zoom in on the red box, where the overflow circuit is located.

As a rough overview of the above photograph, the edge of the die shows the wires going to the pins. Approximately the top fifth of the chip (with the regular rectangular pattern) is the PLA that decodes instructions. The middle third is a bunch of logic, mostly to do additional decoding of instructions. The bottom half has the registers, ALU (arithmetic-logic unit), and main busses. They are all 8 bits, with each bit in a horizontal layer. The high-order bit is at the bottom of the photo, and this is where the overflow logic lies.

The overflow formula

In brief, if an unsigned addition doesn't fit in a byte, the carry flag is set. But if a signed addition doesn't fit in a byte, the overflow flag is set. The 6502 processor computes the overflow bit for addition from the top bits of the two operands (A₇ and B₇), and the carry out of bit 6 into bit 7 (C₆):

V = not (((A₇ NOR B₇) and C₆) NOR ((A₇ NAND B₇) NOR C₆))

For a more detailed explanation of what overflow means, see my previous article or The overflow flag explained.

Gate-level implementation

The overflow computation circuit in the 6502 microprocessor.

Described as gates, the actual circuit to generate the overflow flag in the 6502 turns out to be surprisingly simple. It uses the carry out of bit 6, and the top bits of the two arguments A and B. Since the values of NAND(a7, b7) and NOR(a7, b7) are already available in the ALU (Arithmetic-Logic Unit) for other purposes, the actual overflow circuit is simply the three gates on the right. (The ALU is, of course, much more complex than the part shown above.) This circuit can be seen at the bottom of the 6507 schematic (where the inverted overflow value is called FLOW). You might wonder why the circuit uses NAND and NOR gates so heavily; it turns out that these are much easier to implement with transistors than AND and OR gates.
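
The gate structure described above can be modeled in a few lines of Python. This is only a logical sketch: the function names are hypothetical, and the AND feeding the final NOR is modeled as one combined gate.

```python
def nand(a, b):
    return 1 - (a & b)

def nor(a, b):
    return 1 - (a | b)

def overflow_gates(a7, b7, c6):
    """Overflow from the top operand bits and the carry out of bit 6."""
    nand7 = nand(a7, b7)                   # already available in the ALU
    nor7 = nor(a7, b7)                     # already available in the ALU
    first = nor(nand7, c6)                 # gate 1: NOR of NAND output and carry
    n_overflow = nor(nor7 & c6, first)     # gates 2 and 3: AND feeding a NOR
    return 1 - n_overflow                  # the circuit's output is inverted overflow
```

Enumerating all eight input combinations gives overflow only for (0, 0, 1) and (1, 1, 0), matching the formula.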

Transistor-level implementation

The transistors that implement the overflow circuit in the 6502 microprocessor. The circuits on the left compute the NAND and NOR of the top bits of A and B. The circuit on the right computes the overflow flag. Based on the remarkable transistor-level schematic of the full 6502 chip, reverse-engineered by Balazs.

The circuit above shows the actual implementation of the overflow circuit in the 6502 using NMOS transistors. The circuit to generate the overflow flag is very simple, requiring just a few transistors to implement the three gates. A, B, and carry are the inputs, and the output #overflow is the complement of the overflow signal.

MOS transistors are fairly easy to understand, since they operate like switches. Most of the transistors are NMOS enhancement mode transistors, which can be considered as switches that close if the gate has a positive input, and are open otherwise. The transistors with a black bar are NMOS depletion mode transistors, which can be considered as pull-up resistors, giving a positive output if nothing else pulls the output low.

The three transistors on the left implement a simple logic gate to compute NAND of A and B. If both inputs A and B are positive, the switches close and connect the output to ground (the horizontal line at the bottom). Otherwise, the pullup transistor connects the output to the positive voltage (circle at the top). Thus, the output is the NAND of A and B - 0 if both inputs are positive, and 1 otherwise.

The next three transistors compute NOR of A and B. If A, B, or both are positive, the associated transistor is switched on and connects the output to ground. Otherwise the output is positive.

The remaining transistors are the actual overflow circuit. The next group of three transistors is a NOR gate, which was described above. It computes the NOR of the carry and the NAND output from the ALU, feeding its output into the final group of four transistors. The four transistors on the right implement an AND gate and NOR gate in a single circuit. If the output from the previous circuit is 1, the rightmost transistor switches on, pulling the output (inverted V) to ground. If both NOR7 and CARRY6 are 1, the two associated transistors switch on, pulling the output to ground. Otherwise, the pullup transistor keeps the output high. The result is the complemented overflow value.
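
The switch behavior described above can be approximated with a simple Python model. This is a hedged sketch, not a circuit simulation: the helper nmos_gate is hypothetical and treats each gate as a pullup plus a set of pull-down paths, where a path conducts only if every transistor in series along it is switched on.

```python
def nmos_gate(pulldown_paths):
    """Output is 0 if any series path to ground conducts; otherwise the pullup gives 1."""
    return 0 if any(all(path) for path in pulldown_paths) else 1

def n_overflow(a7, b7, carry6):
    """Inverted overflow, built from the four transistor groups described in the text."""
    nand7 = nmos_gate([[a7, b7]])                # two switches in series
    nor7 = nmos_gate([[a7], [b7]])               # two switches in parallel
    stage1 = nmos_gate([[nand7], [carry6]])      # NOR of NAND output and carry
    return nmos_gate([[stage1], [nor7, carry6]]) # AND-NOR in a single circuit
```

Each inner list is one path to ground, so the final gate has a single-transistor path (the rightmost transistor) in parallel with a two-transistor series path.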

Going to the silicon

Now that you've seen how the circuit works at the transistor level, the silicon level can be explained.

We'll begin with an (oversimplified) description of how the chip is constructed. The chip starts with the silicon wafer. Regions are diffused with an element such as phosphorus or arsenic, yielding conductive n⁺ diffusion regions. On top of the diffusion layer is a layer of polysilicon. On top of the polysilicon layer is a layer of metal "wires" providing more connections. For our purposes, diffusion regions, polysilicon, and metal can all be considered conductors. In the 6502, the polysilicon connections run roughly vertical, and the metal wires run generally horizontal.

Structure of an NMOS transistor. The n⁺ diffusion regions (yellow) are separated by undiffused silicon (gray). The gate is formed by an insulating oxide layer (red) with a polysilicon line (purple) over it.

To build a transistor, two n⁺ regions are separated by an undiffused region. A thin insulating oxide layer on top forms the transistor gate, which is wired to a polysilicon line. When charge is applied to the gate via the polysilicon line, the two n⁺ regions can conduct.

The following picture zooms in on the base silicon layer in the 6502, showing the region in the red outline. The darker gray regions are n⁺ diffusion areas, which have been doped to be conducting. The white stripes that separate n⁺ regions are the transistor gates, showing the thin insulating oxide layer that switches on and off conduction between the neighboring n⁺ regions. The gray squares are vias, which connect to other layers.

The diffusion layer of the 6502, zoomed in on the overflow circuit. The shaded regions are diffusion regions, and the unshaded regions are undiffused silicon. The white strips show transistor gates. From Visual 6502 (CC BY-NC-SA 3.0).

The next picture shows the polysilicon and metal layers that lie on top of the base silicon. This picture is aligned with the previous one, and you may be able to pick out some of the diffusion layer underneath. The whitish vertical stripes are conductive polysilicon. The greenish metallic-looking horizontal stripes are in fact metal, forming conductors. The gray squares are vias, which connect different layers. Note that the chip is crammed full of conductors, making it hard at first glance to tell what is going on.

Closeup of the 6502 microprocessor die, showing the overflow circuit. From Visual 6502 (CC BY-NC-SA 3.0).

The following picture shows approximately how the transistor-level circuit maps onto the silicon. This circuit is the same as the transistor schematic earlier, just drawn to match the actual layout on the chip. The A, B, and CARRY inputs come from other parts of the chip, and the inverted #OVERFLOW output exits on the right to other destinations.

The final picture explains exactly what is happening at the silicon level. It labels the different layers that take part in the overflow circuit with different colors. The lowest layer is the diffusion layer in yellow. On top of this is the polysilicon layer in purple. The topmost layer of metal is in green. Power (Vcc) and ground are supplied through the metal layer. The crosshatches show transistor gates, formed by polysilicon over insulating oxide. The skinny crosshatched areas are the enhancement transistors used as switches. The blocky crosshatched areas connected to Vcc (positive voltage) are the depletion transistors used as pullups.

The circuit can be understood starting in the upper left. A and B are bit 7 of the A and B values going into the ALU. (A and B come from elsewhere in the processor.) If both A and B are positive, the two upper transistors (vertical crosshatches) will pull the NAND output low. If either A or B is positive, one of the two transistors below will pull the NOR output low. The NAND and NOR outputs travel to multiple parts of the ALU through metal, polysilicon, and diffusion "wires", but only the relevant connections are shown.

In the lower left is the first gate of the overflow circuit, computing the NOR of the NAND output and carry (which comes from elsewhere in the chip). The polysilicon line (purple) on the bottom is the output from this gate. In the lower right is the second gate of the overflow circuit, combining the NOR, carry, and output of the first gate. The result is #overflow (i.e. inverted overflow).

You can see this circuit in action in the Visual 6502 simulator. The color scheme in the simulator is different - diffusion is green, yellow, orange, and red. The metal layer is shown in ghosted white, but Vcc and ground are omitted. Polysilicon is in purple, and the transistors are not explicitly shown.

Conclusions

By focusing on a simple circuit, the 6502 microprocessor chip can actually be understood at the silicon level. It's interesting to see how the complex patterns etched on the chip can be mapped onto gates, and their function understood.

More comments on this article are at Hacker News. Thanks for visiting!

↧

Notes on the PLA on the 8085 chip

January 13, 2013, 3:36 pm

The 8085 processor uses a PLA (programmable logic array) to control much of the activity within the processor, such as instruction decoding and controlling the data flow between components of the chip. Pavel Zima has reverse-engineered the transistor-level circuitry of the 8085 microprocessor. I've looked into this a bit more to figure out the architecture of the Programmable Logic Array, which takes up a large fraction of the chip. The PLA circuit is much more complex than the PLA on the 6502, for instance. It turns out that Pavel is ahead of me with information on the decode and timing PLAs, but the information below may still be of interest.

The following diagram shows the arrangement of the PLA on the chip (image from Visual 6502). The PLA has 7 planes, which I have labeled A through G.

The block diagram below shows approximately how the planes are connected. Plane A receives inputs from the instruction circuit. Its outputs are fed into the small plane B, producing outputs that go into the instruction circuit. The outputs from A also are fed into C (through pass transistors).

Planes D and E can be considered the same plane, split apart for better layout. They share 11 input lines, and the remaining inputs are different between D and E. These inputs come from the ALU/register circuits on the left, as well as other parts of the chip. They also receive inputs from G - these inputs are not handled via normal PLA input lines, but are wired through transistors directly to the associated output lines, which makes the layout more compact.

Planes F and G provide outputs through pass transistors to the ALU/register circuits. These outputs probably control the actions and bus activity, but more analysis is needed.

The following diagram shows how the PLA planes are wired to the rest of the chip. Planes D and E in particular receive inputs from many parts of the chip. The outputs from F and G are very short because the displayed wires end at the nearby pass transistors to the left.

The transistors in the PLA

I have diagrams showing where the transistors are in each PLA grid here.

↧

Inside the ALU of the 8085 microprocessor

January 24, 2013, 11:07 pm

The arithmetic-logic unit is a fundamental part of any computer, performing addition, subtraction, and logic operations, but how it works is a mystery to many people. I've reverse-engineered the ALU circuit from the 8085 microprocessor and explain how it works. The 8085's ALU is a surprisingly complex circuit that at first looks like a mysterious jumble of gates, but it can be understood if you don't mind diving into some Boolean logic.

The following diagram shows the location of the ALU in the 8085. The ALU is 8 bits wide, with the high-order bit on the left. The register file is the large block below the ALU. The registers are 16 bits wide, made up of pairs of 8-bit registers. Surprisingly, the register file has the high-order bit on the right, the opposite order from the ALU.

The ALU takes two 8-bit inputs, which I'll call A and X, and performs one of five basic operations: ADD, OR, XOR, AND, and SHIFT-RIGHT. As well, if the input X is inverted, the ALU can perform subtraction and complement operations. You might think SHIFT-LEFT is missing from this list. However, it is simply performed by adding the number to itself, which shifts it to the left one bit in binary. Note that the 8085 arithmetic operations are very basic. There is no multiplication or division operation - these were added in the 8086.

The ALU consists of 8 mostly-identical slices, one for each bit. For addition, each slice of the ALU adds the appropriate input bits, computing the sum A + X + carry-in, generating a sum bit and a carry-out bit. That is, each bit of the ALU implements a full adder. The logic operations simply operate on the two input bits: A AND X, A OR X, A XOR X. Shift-right simply outputs the A bit from the slice to the left.

ALU schematic

The following schematic shows one bit of the ALU. The schematic has roughly the same layout as the implementation on the chip, flowing from bottom to top. Eight of these circuits are stacked side-by-side, with the low-order bit on the right. Carries flow from right to left, and bits shifted right flow from left to right.

Negation

Starting at the bottom of the schematic is the complex gate labeled Negation. This gate optionally selects a negated second argument by selecting either XN or /XN. (XN is the Nth bit of the second argument, which I'll call X. The / indicates the complement.) For most of the discussion below I'll assume XN is uncomplemented to keep things simpler.

Operation

Above the complement selector are a few gates labeled Operation that perform the desired 2-input operation. The NAND gate on the left generates either A NAND X or 1 based on the select_op1 control line. The OR gate on the right generates either A OR X or 1, based on the select_op2 control line. Combining these in the NAND gate yields four different possibilities:

select_op1	select_op2	Result
0	0	A NOR X
0	1	0
1	0	A NXOR X
1	1	A AND X

Note that instead of OR and XOR, the complemented value is produced by this circuit. This will be fixed in the next step.
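
The table can be reproduced with a one-bit Python sketch of the Operation block. The exact gating polarity is an assumption inferred from the table: here select_op1 = 0 forces the left gate's output to 1, and select_op2 = 1 forces the right gate's output to 1. The function name is hypothetical.

```python
def operation(a, x, select_op1, select_op2):
    """One-bit model of the Operation block: NAND of the two gated gate outputs."""
    left = (1 - (a & x)) if select_op1 else 1    # A NAND X, or forced to 1
    right = 1 if select_op2 else (a | x)         # A OR X, or forced to 1
    return 1 - (left & right)                    # final NAND
```

With both control lines low the result is A NOR X, and with only select_op1 high it is the complement of A XOR X, as in the table.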

Combine with carry

Above the operation circuit is the next block of gates labeled Combine with carry that generates the ALU output by merging the carry-in with the operation value via XOR.

To understand this circuit, first consider the following simple XOR circuit, which is used a couple times in the ALU. It can be understood fairly simply: if both inputs are 0 (top) or both inputs are 1 (bottom) then the output is 0.

Ignoring the shift_right circuit for a moment, the block of gates is simply the XOR circuit above. Note that XOR with 0 is a no-op, while XOR with 1 complements the value. And A XOR X XOR CARRY is the low-order bit of adding A, X, and CARRY.

The key point of this circuit is that the incoming carry is generated with the proper value to convert the operation output into the desired final result. The incoming carry /carry(N-1) is either 0, 1, or the complemented carry from bit N-1 as appropriate.

Op	Operation output	Carry	Result
or	A NOR X	1	A OR X
add	A NXOR X	/carry	A XOR X XOR CARRY
xor	A NXOR X	1	A XOR X
and	A AND X	0	A AND X
shift right	0	0	A(N+1)
complement	A NOR /X	1	A OR /X
subtract	A NXOR /X	/carry	A XOR /X XOR CARRY

Note that the carry-in line must have the right value in order to generate the appropriate output. For addition it passes the inverted carry from one bit to the next. But for OR and XOR, the line is set to 1. And for AND and SHIFT_RIGHT it is set to 0. As will be seen below, the carry circuitry generates the right value for the right operation.

The final aspect of this circuit is the shift-right circuit. With a 0 op input, 0 carry input, and shift_right set, the output is simply the bit from the next-higher slice, to the left: A(N+1).

Generate carry

The circuit on the left, labeled Generate carry, generates the carry out. It can generate three different outputs: 1, 0, or the (complemented) carry from the sum. If select_op2 is set, it will force the carry to 0. Otherwise if force_ncarry_1 is set, it will force the carry to 1. Otherwise, the carry is generated for the sum of A + X + carry-in through straightforward logic: If the carry-in is set, and one of the inputs is set, there will be a carry out. If both input bits are set, there will be a carry out.

Flags

The 8085 has a parity flag, which is 1 if the number of 1 bits is even, and 0 if the number of 1 bits is odd. The parity flag is generated by XORing all the result bits together (and complementing). Each bit is XORed with the lower-order parity value by the parity circuit near the top of the schematic. The XOR circuit is the same circuit described above.

The zero flag is computed by a simple circuit: each result bit drives a transistor that will pull the zero line low if the bit is set. This forms an 8-input NOR gate, spread across the ALU.
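
As a rough illustration, the two flags can be modeled bit by bit in Python. The function names are made up for this sketch, and the loops only mimic the logical behavior (an XOR chain and a wired NOR), not the actual transistor arrangement.

```python
def parity_flag(result: int) -> int:
    """Chain XOR across the 8 result bits, then complement."""
    chain = 0
    for n in range(8):
        chain ^= (result >> n) & 1   # each bit XORed with the lower-order parity value
    return 1 - chain                 # 1 when the number of 1 bits is even

def zero_flag(result: int) -> int:
    """8-input NOR: any set bit pulls the shared zero line low."""
    line = 1
    for n in range(8):
        if (result >> n) & 1:
            line = 0                 # this bit's transistor pulls the line low
    return line

assert parity_flag(0b00000011) == 1  # two 1 bits: even parity
assert parity_flag(0b00000111) == 0  # three 1 bits: odd parity
assert zero_flag(0x00) == 1
assert zero_flag(0x80) == 0
```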

The control lines

As seen in the schematic, the 8085 uses multiple control lines to control the activity inside the ALU. In total, the ALU provides 7 different operations and the following table summarizes the control lines that are used for each operation. It also lists the opcodes that use each ALU operation.

Operation	select_neg	select_op1	select_op2	shift_right	force_ncarry_1	Opcodes
or	0	0	0	0	1	ORA
add	0	1	0	0	0	INR,DCR,RLC,DAD,RAL,DAA,ADD,ADC,ADI,ACI
xor	0	1	0	0	1	XRA,XRI
and	0	1	1	0	1	ANA,ANI
shift right	0	0	1	1	1	RRC, RAR
complement	1	0	0	0	1	CMA
subtract	1	1	0	0	0	SUB,SBB,SUI,SBI,CMP,CPI

The ALU control lines are generated from the opcode by the programmable logic array. Specifically, they are outputs from PLA F, which is to the right of the ALU. More details are in my article on the PLA. The ALU has additional control lines to set up the registers, initialize the carry bits, and set the flags. These control the differences between different opcodes, beyond the categories above. I will explain those in a future article.
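
To see how the five control lines in the table select an operation, here is a bit-level Python sketch of the eight slices. It is a model under several assumptions, not a transcription of the chip: the value on the inverted-carry line entering bit 0, the way the shifted-in bit A(N+1) is merged into the output (an OR here), the `shift_in` value for the top bit, and the function and parameter names are all guesses made for illustration.

```python
# (select_neg, select_op1, select_op2, shift_right, force_ncarry_1), from the table
CONTROLS = {
    "or":          (0, 0, 0, 0, 1),
    "add":         (0, 1, 0, 0, 0),
    "xor":         (0, 1, 0, 0, 1),
    "and":         (0, 1, 1, 0, 1),
    "shift right": (0, 0, 1, 1, 1),
    "complement":  (1, 0, 0, 0, 1),
    "subtract":    (1, 1, 0, 0, 0),
}

def alu(a: int, x: int, op: str, carry_in: int = 0, shift_in: int = 0) -> int:
    """Bit-serial model of the 8-bit ALU; carry_in and shift_in are assumptions."""
    neg, op1, op2, shr, force1 = CONTROLS[op]
    # Assumed value of the inverted-carry line entering bit 0.
    ncarry = 0 if op2 else 1 if force1 else 1 - carry_in
    result = 0
    for n in range(8):
        a_n = (a >> n) & 1
        x_n = ((x >> n) & 1) ^ neg                 # Negation: pick XN or /XN
        nand_out = 1 - (a_n & x_n) if op1 else 1   # Operation block
        or_out = 1 if op2 else (a_n | x_n)
        op_out = 1 - (nand_out & or_out)
        bit = op_out ^ ncarry                      # Combine with carry
        if shr:                                    # shift right: take A(N+1)
            bit |= (a >> (n + 1)) & 1 if n < 7 else shift_in
        result |= bit << n
        # Generate carry: forced 0, forced 1, or complemented carry of the sum
        if op2:
            ncarry = 0
        elif force1:
            ncarry = 1
        else:
            carry = 1 - ncarry
            ncarry = 1 - ((a_n & x_n) | (carry & (a_n | x_n)))
    return result

assert alu(0x35, 0x47, "add") == 0x7C
assert alu(0x0F, 0x30, "or") == 0x3F
assert alu(0xFF, 0x0F, "xor") == 0xF0
assert alu(0xF0, 0x3C, "and") == 0x30
assert alu(0x82, 0x00, "shift right") == 0x41
assert alu(0x50, 0x30, "subtract", carry_in=1) == 0x20  # A + /X + 1
assert alu(0x00, 0x5A, "complement") == 0xA5            # A OR /X with A = 0
```

In this model a plain subtraction needs `carry_in=1`, since A - X is computed as A + /X + 1; how the chip initializes that carry is handled by the additional control lines mentioned above and is not modeled here. The complement example feeds the value as X with A held at 0, which is likewise a guess about operand routing.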

Reverse-engineering the ALU

This information is based on the 8085 reverse-engineering done by the visual 6502 team. This team dissolves chips in acid to remove the packaging and then takes many close-up photographs of the die inside. Pavel Zima converted these photographs into mask layer images, generated a transistor net from the layers, and wrote a transistor-level 8085 simulator.

I took the transistor net and used it to figure out how the ALU works. First, I converted the transistor net into gates. Next I figured out which gates are part of the ALU and put them into a schematic. Then I examined how the circuit worked for different operations and eventually figured out how it works.

Conclusion

The ALU of the 8085 is an interesting circuit. At first it seemed like an incomprehensible pile of gates with mysterious control lines, but after some investigation I figured it out. The 8085 ALU is implemented very differently from the 6502's ALU (which I'll write up later). The 6502's ALU uses fairly straightforward circuits to generate the SUM, AND, XOR, OR, and SHIFT values in parallel, and then uses a simple pass-transistor multiplexor to pick the desired operation. This is in contrast to the 8085 ALU, which generates only the desired value.

↧

Silicon reverse engineering: The 8085's undocumented flags

February 12, 2013, 10:16 pm

The 8085 microprocessor has two undocumented status flags: V and K. These flags can be reverse-engineered by looking at the silicon of the chip, and their function turns out to be different from previous explanations. In addition, the implementation of these flags shows that they were deliberately implemented, which raises the question of why they were not documented or supported by Intel. Finally, examining how these flag circuits were implemented in silicon provides an interesting look at how microprocessors are physically implemented.

Like most microprocessors, the 8085 has a flag register that holds status information on the results of an operation. The flag register is 8 bits: bit 0 holds the carry flag, bit 2 holds the parity, bit 3 is always 0, bit 4 holds the half-carry, bit 6 holds the zero status, and bit 7 holds the sign. But what about the missing bits: 1 and 5?

Back in 1979, users of the 8085 determined that these flag bits had real functions.[1] Bit 1 is a signed-number overflow flag, called V, indicating that the result of a signed add or subtract won't fit in a byte.[2] Bit 5 of the flag is poorly understood and has been given the names K, X5, or UI. For an increment/decrement operation it simply indicates 16-bit overflow or underflow. But it has a totally different value for arithmetic operations. The flag has been described[1][3] as:

K =  O1·O2 + O1·R + O2·R, where:
O1 = sign of operand 1
O2 = sign of operand 2
R = sign of result
For subtraction and comparisons, replace O2 with complement of O2.

As I will show, that published description is mistaken. The K flag actually is the V flag exclusive-ored with the sign of the result. And the purpose of the K flag is to compare signed numbers.
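
For reference, the flag register layout described above, including the two undocumented bits, can be written down as a small decoder. This is only a sketch; the names `FLAG_BITS` and `decode_flags` are invented for the example.

```python
# Bit positions in the 8085 flag byte, as described above (bit 3 is always 0).
FLAG_BITS = {
    "carry": 0,
    "V": 1,           # undocumented signed-overflow flag
    "parity": 2,
    "half_carry": 4,
    "K": 5,           # undocumented flag, also called X5 or UI
    "zero": 6,
    "sign": 7,
}

def decode_flags(flag_byte: int) -> dict[str, int]:
    """Split a flag byte into its named bits."""
    return {name: (flag_byte >> bit) & 1 for name, bit in FLAG_BITS.items()}

flags = decode_flags(0b10100010)
assert flags["sign"] == 1 and flags["K"] == 1 and flags["V"] == 1
assert flags["carry"] == 0 and flags["zero"] == 0
```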

The circuit for the K and V flags

The following schematic shows the reverse-engineered circuit for the K and V flags in the 8085. The V flag is simply the exclusive-or of the carry into the top bit and the carry out of the top bit. This is a standard formula for computing overflow[2] for signed addition and subtraction. (The 6502 computes the same overflow value through different logic.) The V flag has values for other arithmetic operations, but the values aren't useful.[4] A latch stores the value of the V flag. The computed V value is stored in the latch under the control of a store_v_flag control signal. Alternatively, the flag value can be read off the bus and stored in the latch under the control of the bus_to_flags control signal; this is how the POP PSW instruction, which pops the flags from the stack, is implemented. Finally, a tri-state superbuffer (the large triangle) writes the flag value to the bus when needed.

The K flag circuitry is on the right. The first function of the K flag is overflow/underflow for an INX/DCX instruction. This is implemented simply: the carry_to_k_flag control line sets the K flag according to the carry from the incrementer/decrementer. The next function of the K flag is reading from the databus for the POP PSW instruction, which is the same as for the V flag. The final function of the K flag is the result of a signed comparison. The K flag is the exclusive-or of the V flag and the sign bit of the result. For subtraction and comparison, the K flag is 1 if the second value is larger than the first.[5] The K flag is set for other arithmetic operations, but doesn't have a useful value except for signed comparison and subtraction.[4]

The circuit in the 8085 for the undocumented V and K flags. The flags are generated from the carries and results from the ALU. The K flag can also be set by the carry from the incrementer/decrementer.

One mystery was the purpose of the K flag: "It does not resemble any normal flag bit."[1] Its use for increment and decrement is clear, but for arithmetic operations why would you want the exclusive-or of the overflow and sign? It turns out that the K flag is useful for signed comparisons. If you're comparing two signed values, the first is smaller if the exclusive-or of the sign and overflow is 1.[6] This is exactly what the K flag computes.
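
The signed-comparison property is easy to confirm by brute force. The sketch below models the subtraction arithmetically as a + ~b + 1 rather than gate by gate, and the helper names are invented for the example; it checks every pair of 8-bit values.

```python
def to_signed(value: int) -> int:
    """Interpret an 8-bit value as two's complement."""
    return value - 256 if value & 0x80 else value

def subtract_flags(a: int, b: int) -> tuple[int, int, int]:
    """Return (S, V, K) for the 8-bit subtraction a - b, computed as a + ~b + 1."""
    nb = ~b & 0xFF
    carry6 = ((a & 0x7F) + (nb & 0x7F) + 1) >> 7   # carry into the top bit
    total = a + nb + 1
    carry7 = total >> 8                             # carry out of the top bit
    s = (total >> 7) & 1                            # sign of the result
    v = carry6 ^ carry7                             # signed overflow
    k = v ^ s                                       # K = V xor S
    return s, v, k

mismatches = [
    (a, b)
    for a in range(256)
    for b in range(256)
    if subtract_flags(a, b)[2] != int(to_signed(a) < to_signed(b))
]
print(len(mismatches))  # 0
```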

From the circuit above, it is clear that the V and K flags were deliberately added to the chip. (This is in contrast to the 6502, where undocumented opcodes have arbitrary results due to how the circuitry just happens to work for unexpected inputs.[7]) Why would Intel add the above circuitry to the chip and then not document or support it? My theory is that Intel decided they didn't want to support K or (8-bit) V flags in the 8086, so in order to make the 8086 source-compatible with the 8085, they dropped those flags from the 8085 documentation, but the circuitry remained in the chip.

The silicon

The 8085 microprocessor showing the data bus, ALU, flag logic, registers, and incrementer/decrementer.

The remainder of this article will show how the V and K flag circuits work, diving all the way down to the silicon circuits. The above image of the 8085 chip shows the layout of the chip and the components that are important to the discussion. In the upper left of the chip is the ALU (arithmetic-logic unit), where computations happen (details). The data bus is the main interconnect in the chip, connecting the data pins (upper left), the ALU, the data registers, the flag register, and the instruction decoding (upper right). In the lower left of the chip is the 16-bit register file. Underneath the register file is a 16-bit increment/decrement circuit which handles incrementing the program counter, as well as supporting 16-bit increment and decrement instructions. The increment/decrement circuit has a carry-out in the lower right corner - this will be important for the discussion of the K flag. For some reason, the ALU has the low-order bit on the right, while the registers have the low-order bit on the left.

The flag logic circuitry sits underneath the ALU, with high-current drivers right on top of the data bus. The flags are arranged in apparently-random order with bit 7 (sign) on the left and bit 6 (zero) on the right. Because the carry logic is much more complicated (handling not only arithmetic operations but shifts and rotates, carry complement, and decimal adjust), the carry logic is stuck off to the right of the ALU where there was enough room.

Zooming in

Next we will zoom in on the V flag circuitry, labeled V1 above. Looking at the die under a microscope shows the metal layer of the chip, consisting of mostly-horizontal metal interconnects, which are the white lines below. The bottom part of the chip has the 8-bit data bus. Other wires are the VCC power supply, ground, and a variety of signals. While modern processors can have ten or more metal layers, the 8085 only has a single layer. Some of the circuitry underneath the metal is visible.

The metal layer of the 8085 microprocessor, zoomed in on the V flag circuit.

If the metal is removed from the chip, the silicon layer becomes visible. The blotchy green/purple is plain silicon. The pink regions are N-type doped silicon. The grayish regions are polysilicon, which can be considered as simply conductive wires. When polysilicon crosses doped silicon, it forms a transistor, which appears light green in this image. Note that transistors form a fairly small portion of the chip; there is a lot more connection and wiring than actual transistors. The small squares are vias, connections to the metal layer.

The V flag circuit in the 8085 CPU. This is the silicon/polysilicon after the metal layer has been removed. The data bus is not visible as it is in the metal layer, but it is in the lower third of the image. The rectangles at the bottom connect the data bus to the registers.

MOSFET transistors

For this discussion, a MOSFET can be considered simply a switch that closes if the gate input is 1 and opens if the gate input is 0. A MOSFET transistor is implemented by separating two diffusion regions, and putting a polysilicon wire over the gate. An insulating layer prevents any current from flowing between the gate and the rest of the transistor. In the following diagram, the n+ diffusion regions are pink, the polysilicon gate conductor is dull green, and the insulating oxide layer is turquoise.

NOR gate

The NOR gate is a fundamental building block in the 8085, since it is a very simple gate that can form more complex logic. A NOR gate is implemented through two transistors and a pullup transistor. If either input (or both) is 1, the corresponding transistor connects the output to ground. Otherwise, the transistors are open, and the pullup pulls the output high. The pullup is shown as a resistor in the schematic, but it is actually a type of transistor called a depletion-mode transistor for better performance.

By zooming in to a single NOR gate in the 8085, we can see how the gate is actually implemented. One surprise is that the circuit is almost all wiring; the transistors form a very small part of the circuit. The two transistors are connected to ground on the left, and tied together on the right. The pullup transistor is much larger than the other transistors for technical reasons.[8]

To understand the circuit, trace the path from ground to each transistor, across the gate, and to the output. In this way you can see there are two paths from ground to the output, and if either input is 1 the output will be 0.

The layout of the gate is intended to be as efficient as possible, given the constraints of where the power (VCC), ground, and other connections are, yielding a layout that looks a bit unusual. The power, ground, and input signals are all in the metal layer above (not shown here), and are connected to this circuit through vias between the metal and the silicon below.

A NOR gate in the 8085 microprocessor, showing the components. If either input is high, the associated transistor will connect the output to ground. Otherwise the pullup transistor will pull the output high.

Exclusive-or gates

The exclusive-or circuit (which outputs a 1 if exactly one input is 1) is a key component of the flag circuitry, and illustrates how more complex logic can be formed out of simpler gates. The schematic below shows how the exclusive-or is built from a NOR gate and an AND-NOR gate; it is straightforward to verify that if both inputs are 0 or both inputs are 1, the output will be 0.

You may wonder why the 8085 uses so many "strange" gates such as a combined AND-NOR, instead of "normal" gates like AND. The transistor-level schematic shows that an AND-NOR gate can actually be implemented very simply with MOSFETs, in fact simpler than a plain AND gate. The two rightmost transistors form the "AND" - if they both have 1 inputs, they connect the output to ground. The transistor to the left forms the other part of the NOR - if it has a 1 input, it pulls the output to ground.

The following diagram shows an XOR circuit in the 8085 that matches the schematic above. (This is the XOR gate that generates the K flag.) On the left is the NOR gate discussed above, and on the right is the AND-NOR circuit, both outlined with a dotted line. As before, the circuit is mostly wiring, with the transistors forming a small part of the circuit (the green regions between pink diffusion regions).

An XOR gate in the 8085 microprocessor, formed from a NOR gate and an AND-NOR gate. If both inputs are 0, the NOR gate output will be 1, and the NOR transistor will pull the output to 0. If both inputs are 1, the AND transistors will pull the output to 0. Otherwise the pullup transistor will pull the output to 1.
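
The way these gates fall out of pulldown transistors can be sketched with a tiny switch-level model. The idea, simplified for illustration, is that an NMOS gate outputs 0 whenever some path of "on" transistors connects the output to ground, and the pullup supplies a 1 otherwise. The helper `nmos_gate` and its list-of-paths representation are inventions for this example.

```python
def nmos_gate(pulldown_paths: list[list[int]]) -> int:
    """Output is 0 if every transistor along any one path to ground is on."""
    for path in pulldown_paths:
        if all(path):
            return 0          # this path connects the output to ground
    return 1                  # no path conducts, so the pullup wins

def nor(a: int, b: int) -> int:
    return nmos_gate([[a], [b]])        # two transistors in parallel

def and_nor(a: int, b: int, c: int) -> int:
    return nmos_gate([[a, b], [c]])     # series pair in parallel with one transistor

def xor(a: int, b: int) -> int:
    # The AND pair pulls low when both inputs are 1;
    # the NOR output pulls low when both inputs are 0.
    return and_nor(a, b, nor(a, b))

assert [xor(a, b) for a in (0, 1) for b in (0, 1)] == [0, 1, 1, 0]
```

The model also suggests why AND-NOR is cheap: it is just one more transistor in series in a pulldown path, whereas a plain AND would need an extra inverting stage.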

The flag latch

Each flag bit is stored in a simple latch circuit made up of two inverters. To store a 1, the inverter on the right outputs a 0, which is fed into the inverter on the left, which outputs a 1, which is fed back to the inverter on the right. A zero is stored in a similar (but opposite) manner. When the clock input is low, the pass transistor opens, breaking the feedback loop, and new data can be written into the latch. The complemented output (/out) is taken from the inverter.

You might wonder why the latch doesn't lose its data whenever the clock goes low. There's an interesting trick here called dynamic logic. Because the gate of a MOSFET consists of an insulating layer it has very high resistance. Thus, any electrical charge on the gate will remain there for some time[9] when the pass transistor opens. When the pass transistor closes, the charge is refreshed.

The latch used in the 8085 to store a flag value. The latch uses two inverters to store the data. When the clock is low, a new value can be written to the latch.

The following part of the 8085 chip shows the implementation of the latch for the V flag. The circuit closely matches the schematic above. The two inverters are outlined with dotted lines. The red arrows show the flow of data through the circuit. As before, the wiring and pullup transistors take up most of the silicon real estate.

Each flag in the 8085 uses a two-inverter latch to store the flag. This shows the latch for the undocumented V flag. The red arrows show the flow of data.

Driving the data bus with a superbuffer

Another interesting feature of the flag circuit is the "superbuffer". Most transistors in the 8085 only send a signal a short distance. However, to send a signal on the data bus across the whole chip takes a lot more power, so a superbuffer is used. In the superbuffer, one transistor is driven to pull the output low, while a second transistor is driven to pull the output high. (This is in contrast to a regular gate, which uses a depletion-mode pullup transistor to pull the output high.) In addition, these transistors are considerably larger, to provide more current.[8] These two transistors are shown at the bottom of the schematic below.

The other feature of this superbuffer is that it is tri-state. In addition to a 0 or 1 output, it has a third state, which basically consists of providing no output. This way, the flags do not affect the data bus except when desired. In the schematic, it can be seen that if the control input is 1, both NOR gates will output 0, and both transistors will do nothing.

The superbuffer used in the 8085 to drive the data bus.
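
The tri-state behavior can be mimicked in a few lines of Python. This is a loose sketch: which NOR gate receives the flag and which receives its complement, and the polarity of the value on the bus, are assumptions made for illustration. Only the property that a control input of 1 turns both drive transistors off comes from the description above.

```python
from typing import Optional

def nor(a: int, b: int) -> int:
    return 1 - (a | b)

def superbuffer(flag: int, control: int) -> Optional[int]:
    """Return the driven bus value, or None when the output is disconnected."""
    pull_high = nor(control, 1 - flag)   # assumed wiring: on when enabled and flag is 1
    pull_low = nor(control, flag)        # assumed wiring: on when enabled and flag is 0
    if pull_high:
        return 1
    if pull_low:
        return 0
    return None                          # both transistors off: third state

def bus_value(drivers: list[Optional[int]]) -> Optional[int]:
    """Resolve a shared bus line; at most one driver may be active."""
    active = [d for d in drivers if d is not None]
    if len(active) > 1:
        raise ValueError("bus contention")
    return active[0] if active else None

assert superbuffer(1, control=0) == 1
assert superbuffer(0, control=0) == 0
assert superbuffer(1, control=1) is None
assert bus_value([superbuffer(1, 1), superbuffer(0, 0)]) == 0
```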

The following diagram shows the two drive transistors, as well as the line used to read the flag from the data bus. (The NOR gates are not shown.) Note the size of these transistors compared to transistors seen earlier. Each flag bit requires a superbuffer such as this. Even flag bit 3, which is always 0, requires a large transistor to drive the 0 onto the bus - it's surprising that a do-nothing flag still takes up a fair bit of silicon.

Each flag in the 8085 uses a superbuffer to drive the value onto the data bus. This figure shows the two large transistors that drive the V flag onto bit 1 of the data bus.

Putting it all together

The above discussion has shown the details of the XOR gate that computes the K flag, and the latch and superbuffer for the V flag. The following diagram shows how these pieces fit into the overall circuitry. The latch and driver for the K flag are outside this image, to the right. The circuits below are tied together by the metal layer, which isn't shown. Compare this diagram with the schematic at the top of the article to see how the components are implemented. The two XOR circuits look totally different, since their layouts have been optimized to fit with the signals they need.

The 8085 circuits to implement the undocumented V and K flags. The ALU provides /carry6, /carry7, and result7. The XOR circuit on the left generates V, and the XOR circuit in the middle generates K. On the right are the latch for the V flag, and the superbuffer that outputs the flag to the data bus. The K flag latch and superbuffer are to the right, not shown.

By looking at the silicon chip carefully, the transistors, gates, and complex circuits start to make sense. It's amazing to think that the complex computers we use are built out of these simple components. Of course, processors now are way more complex than the 8085, with billions of transistors instead of thousands, but the basic principles are still the same.

If you found this discussion interesting, check out my earlier analysis of the 6502's overflow flag and the 8085's ALU. You may also be interested in the book The Elements of Computing Systems, which describes how to build a computer starting with Boolean logic.

Credits

The chip images are from visual6502.org. The visual6502 team did the hard work of dissolving chips in acid to remove the packaging and then taking many close-up photographs of the die inside. Pavel Zima converted these photographs into mask layer images, a transistor net, and an 8085 simulator.

Notes and references

[1] The undocumented instructions and flags of the 8085 were discovered by Wolfgang Dehnhardt and Villy M. Sorensen in the process of writing an 8085 assembler, and were written up in the article Unspecified 8085 op codes enhance programming, Engineer's Notebook, "Electronics" magazine, Jan 18, 1979 p 144-145.

[2] See my article The 6502 overflow flag explained mathematically for details on overflow. There are multiple ways of computing overflow, and the 6502 uses a different technique.

[3] Tundra Semiconductor sold the CA80C85B, a CMOS version of the 8085. Interestingly, the undocumented opcodes and flags are described in the datasheet for this part: CA80C85B datasheet, 8000-series components.

The interesting thing about the Tundra datasheet is the descriptions of the "new" flags and instructions are copied almost exactly from Dehnhardt's article except for the introduction of errors, missing parentheses, and renaming the K flag as UI. In addition, as I described earlier, the published K/UI flag formula doesn't always work. Thus, it appears that despite manufacturing the chip, Tundra didn't actually know how these circuits worked.

[4] The V flag makes sense for signed addition and subtraction, and the K flag makes sense for signed subtraction and comparison. Many other operations affect these flags, but the flags may not have any useful meaning.

The V flag is 0 for RRC, RAR, AND, OR, and XOR operations, since these operations have constant carry values inside the ALU (details). The RLC and RAL operations add the accumulator to itself, so they can be treated the same as addition: V is set if the signed result is too big for a byte. The V flag for DAA can also be understood in terms of the underlying addition: V will only be set if the top digit goes from 7 to 8. However, since BCD digits are unsigned, V has no useful meaning with DAA. DAD is an interesting case, since the V flag indicates 16-bit signed overflow; it is actually computed from the result of the high-order addition. For INR, the only overflow case is going from 0x7f to 0x80 (127 to -128); note that going from 0xff to 0x00 corresponds to -1 to 0, which is not signed overflow even though it is unsigned overflow. Likewise, DCR sets the V flag going from 0x80 to 0x7f (-128 to 127), while going from 0x00 to 0xff is not signed overflow.

The K flag has a few special cases. For AND, OR, and XOR, the K flag is the same as the sign, since the V flag is 0. Note that the K flag is computed entirely differently for INR/DCR compared to INX/DCX. For INR and DCR, the K flag is S^V, which almost always is S. The K flag is set for DAA if S^V is true, which doesn't have any useful meaning since BCD values are unsigned.

The published formula for the K flag gives the wrong value for XOR if both arguments are negative.
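
A short Python check makes the discrepancy concrete. It assumes the published formula is applied exactly as quoted (with O2 complemented only for subtraction and comparison) and models addition and subtraction arithmetically; the function names are invented for the sketch. Under those assumptions the formula matches V xor S for every addition and subtraction, and disagrees for XOR of two negative values, where V is 0 and so K is just the sign of the result.

```python
def sign(value: int) -> int:
    """Bit 7 of a value (works for sums wider than 8 bits too)."""
    return (value >> 7) & 1

def published_k(o1: int, o2: int, r: int) -> int:
    """K = O1·O2 + O1·R + O2·R, i.e. the majority of the three sign bits."""
    return (o1 & o2) | (o1 & r) | (o2 & r)

def actual_k_add(a: int, b: int, carry_in: int = 0) -> int:
    """K = V xor S for a + b + carry_in, with V taken from the top two carries."""
    carry6 = ((a & 0x7F) + (b & 0x7F) + carry_in) >> 7
    total = a + b + carry_in
    v = carry6 ^ (total >> 8)
    return v ^ sign(total)

for a in range(256):
    for b in range(256):
        # Addition: the two descriptions agree.
        assert published_k(sign(a), sign(b), sign(a + b)) == actual_k_add(a, b)
        # Subtraction as a + ~b + 1, with O2 complemented in the formula.
        nb = ~b & 0xFF
        assert published_k(sign(a), 1 - sign(b), sign(a + nb + 1)) == actual_k_add(a, nb, 1)

# XOR of two negative values: the formula says 1, but K = S = 0.
a, b = 0x80, 0x80
r = a ^ b
print(published_k(sign(a), sign(b), sign(r)), sign(r))  # 1 0
```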

[5] The following table illustrates the 8 possible cases when comparing signed numbers A and B. The inputs are the top bit of A, the top bit of B, and the carry from bit 6 when subtracting B from A. The outputs are the carry, borrow (complement of carry), sign, overflow, and K flags. An example is given for each row. Note that the K flag is set if A is less than B when treated as signed numbers.

Inputs			Outputs					Example
A₇	B₇	C₆	C	B	S	V	K	Hex	Signed comparison
0	1	0	0	1	0	0	0	0x50 - 0xf0 = 0x60	80 - -16 = 96
0	1	1	0	1	1	1	0	0x50 - 0xb0 = 0xa0	80 - -80 = -96
0	0	0	0	1	1	0	1	0x50 - 0x70 = 0xe0	80 - 112 = -32
0	0	1	1	0	0	0	0	0x50 - 0x30 = 0x120	80 - 48 = 32
1	1	0	0	1	1	0	1	0xd0 - 0xf0 = 0xe0	-48 - -16 = -32
1	1	1	1	0	0	0	0	0xd0 - 0xb0 = 0x120	-48 - -80 = 32
1	0	0	1	0	0	1	1	0xd0 - 0x70 = 0x160	-48 - 112 = 96
1	0	1	1	0	1	0	1	0xd0 - 0x30 = 0x1a0	-48 - 48 = -96

[6] A detailed explanation of signed comparisons is given in Beyond 8-bit Unsigned Comparisons by Bruce Clark, section 5. While this article is in the context of the 6502, the discussion applies equally to the 8085.

[7] The illegal opcodes in the 6502 are discussed in detail in How MOS 6502 Illegal Opcodes really work. In the 6502, the operations performed by illegal opcodes are unintended, just chance based on what the chip logic happens to do with unexpected inputs. In contrast, the undocumented opcodes in the 8085, like the undocumented flags, are deliberately implemented.

[8] The key parameter in the performance of a MOSFET transistor is the width to length ratio of the gate. Oversimplifying slightly, the current provided by the transistors is proportional to this ratio. (Width is the width of the source or drain, and length is the length across the gate from source to drain.) For an inverter, the W/L ratio of the pullup should be approximately 1/4 the W/L ratio of the input transistor for best performance. (See Introduction to VLSI Systems, Mead, Conway, p 8.) The result is that pullup transistors are big and blocky compared to pulldown transistors. Another consequence is that high-current transistors in a superbuffer have a very wide gate. The 8085 register file has some transistors where the W/L ratios are carefully configured so one transistor will "win" over the other if both are on at the same time. (This is why the 8085 simulator is more complex than the 6502 simulator, needing to take transistor sizes into account.)

[9] One effect of using pass-transistor dynamic buffers is that if the clock speed is too low, the charge will eventually drain away, causing data loss. As a result the 8085 has a minimum clock speed of 500 kHz. Likewise, the 6502 has a minimum clock speed. The Z-80 in contrast is designed with static logic, so it has no minimum clock speed - the clock can be stepped as slowly as desired.

↧

8085 instruction set: the octal table

February 23, 2013, 10:46 am

The instruction set of the 8085 microprocessor has an underlying structure that becomes much clearer if expressed in an octal-based table, rather than the usual hexadecimal-based table:

	\0_0	\0_1	\0_2	\0_3	\0_4	\0_5	\0_6	\0_7	\1_0	\1_1	\1_2	\1_3	\1_4	\1_5	\1_6	\1_7
\00_	NOP	LXI B,d16	STAX B	INX B	INR B	DCR B	MVI B,d8	RLC	MOV B,B	MOV B,C	MOV B,D	MOV B,E	MOV B,H	MOV B,L	MOV B,M	MOV B,A
\01_	dsub	DAD B	LDAX B	DCX B	INR C	DCR C	MVI C,d8	RRC	MOV C,B	MOV C,C	MOV C,D	MOV C,E	MOV C,H	MOV C,L	MOV C,M	MOV C,A
\02_	arhl	LXI D,d16	STAX D	INX D	INR D	DCR D	MVI D,d8	RAL	MOV D,B	MOV D,C	MOV D,D	MOV D,E	MOV D,H	MOV D,L	MOV D,M	MOV D,A
\03_	rdel	DAD D	LDAX D	DCX D	INR E	DCR E	MVI E,d8	RAR	MOV E,B	MOV E,C	MOV E,D	MOV E,E	MOV E,H	MOV E,L	MOV E,M	MOV E,A
\04_	RIM	LXI H,d16	SHLD a16	INX H	INR H	DCR H	MVI H,d8	DAA	MOV H,B	MOV H,C	MOV H,D	MOV H,E	MOV H,H	MOV H,L	MOV H,M	MOV H,A
\05_	ldhi r8	DAD H	LHLD a16	DCX H	INR L	DCR L	MVI L,d8	CMA	MOV L,B	MOV L,C	MOV L,D	MOV L,E	MOV L,H	MOV L,L	MOV L,M	MOV L,A
\06_	SIM	LXI SP,d16	STA a16	INX SP	INR M	DCR M	MVI M,d8	STC	MOV M,B	MOV M,C	MOV M,D	MOV M,E	MOV M,H	MOV M,L	HLT	MOV M,A
\07_	ldsi r8	DAD SP	LDA a16	DCX SP	INR A	DCR A	MVI A,d8	CMC	MOV A,B	MOV A,C	MOV A,D	MOV A,E	MOV A,H	MOV A,L	MOV A,M	MOV A,A
\20_	ADD B	ADD C	ADD D	ADD E	ADD H	ADD L	ADD M	ADD A	RNZ	POP B	JNZ a16	JMP a16	CNZ a16	PUSH B	ADI d8	RST 0
\21_	ADC B	ADC C	ADC D	ADC E	ADC H	ADC L	ADC M	ADC A	RZ	RET	JZ a16	rstv	CZ a16	CALL a16	ACI d8	RST 1
\22_	SUB B	SUB C	SUB D	SUB E	SUB H	SUB L	SUB M	SUB A	RNC	POP D	JNC a16	OUT d8	CNC a16	PUSH D	SUI d8	RST 2
\23_	SBB B	SBB C	SBB D	SBB E	SBB H	SBB L	SBB M	SBB A	RC	shlx	JC a16	IN d8	CC a16	jnk a16	SBI d8	RST 3
\24_	ANA B	ANA C	ANA D	ANA E	ANA H	ANA L	ANA M	ANA A	RPO	POP H	JPO a16	XTHL	CPO a16	PUSH H	ANI d8	RST 4
\25_	XRA B	XRA C	XRA D	XRA E	XRA H	XRA L	XRA M	XRA A	RPE	PCHL	JPE a16	XCHG	CPE a16	lhlx	XRI d8	RST 5
\26_	ORA B	ORA C	ORA D	ORA E	ORA H	ORA L	ORA M	ORA A	RP	POP PSW	JP a16	DI	CP a16	PUSH PSW	ORI d8	RST 6
\27_	CMP B	CMP C	CMP D	CMP E	CMP H	CMP L	CMP M	CMP A	RM	SPHL	JM a16	EI	CM a16	jk a16	CPI d8	RST 7

The large-scale structure of the instruction set is by quadrant (i.e. the top two bits): MOV instructions in the pink quadrant, arithmetic instructions in the cyan quadrant, increment, decrement, rotates in the yellow quadrant, and control flow (jump, call, return, push, pop, rst) in the purple quadrant. It's not totally regular, of course. Some instructions are wedged in where they can fit, for example the spot where memory-to-memory move (MOV M, M) would go is replaced by HLT.

Note how registers are controlled by an octal digit in the sequence B, C, D, E, H, L, M, and A. This is especially notable for the MOV instructions and arithmetic instructions. For instructions acting on register pairs, the structure is similar: BC, BC, DE, DE, HL, HL, SP, SP.

Although octal is unpopular now, early microprocessors were designed with octal in mind, using groups of three bits to select registers and operations. Now hexadecimal is popular, but when the opcodes are displayed in a hex-based table, the underlying structure of the instructions is obscured.
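
The three-bit grouping is easy to see in code. The following sketch decodes only the MOV and register-arithmetic quadrants by splitting the opcode into octal digits; the function name and the decision to leave the other two quadrants unmodeled are choices made for this example.

```python
REGISTERS = ["B", "C", "D", "E", "H", "L", "M", "A"]
ARITHMETIC = ["ADD", "ADC", "SUB", "SBB", "ANA", "XRA", "ORA", "CMP"]

def decode(opcode: int) -> str:
    """Decode the MOV and register-arithmetic quadrants from octal digits."""
    quadrant = opcode >> 6          # top two bits
    middle = (opcode >> 3) & 7      # middle octal digit
    low = opcode & 7                # low octal digit
    if quadrant == 1:
        if middle == 6 and low == 6:
            return "HLT"            # occupies the slot where MOV M,M would go
        return f"MOV {REGISTERS[middle]},{REGISTERS[low]}"
    if quadrant == 2:
        return f"{ARITHMETIC[middle]} {REGISTERS[low]}"
    return f"(not modeled: {opcode:03o})"

print(decode(0o103))  # MOV B,E
print(decode(0x76))   # HLT
print(decode(0o256))  # XRA M
```

Written in hex, the same opcodes are 0x43, 0x76, and 0xAE, which hides the register and operation fields across nibble boundaries.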

Note that the four blocks have been arranged for ease of display - strictly speaking they should be stacked vertically rather than a 2x2 grid. The table includes undocumented instructions, which are shown in lower case. Credits: original data from pastraiser.com 8085 instruction table.

How the 8085 decodes instructions internally

The 8085 uses a set of PLAs to decode and process instructions. In the first step of processing an instruction the instruction decode ROM (details) decodes the instruction into one of 48 different instruction groups. The grid below is colored according to the instruction group (0 through 47).

	\0_0	\0_1	\0_2	\0_3	\0_4	\0_5	\0_6	\0_7	\1_0	\1_1	\1_2	\1_3	\1_4	\1_5	\1_6	\1_7
\00_	NOP	LXI B,d16 (42)	STAX B (40)	INX B (36)	INR B (38)	DCR B (38)	MVI B,d8 (14)	RLC (25)	MOV B,B (45)	MOV B,C (45)	MOV B,D (45)	MOV B,E (45)	MOV B,H (45)	MOV B,L (45)	MOV B,M (44)	MOV B,A (45)
\01_	dsub (21)	DAD B (20)	LDAX B (41)	DCX B (37)	INR C (38)	DCR C (38)	MVI C,d8 (14)	RRC (25)	MOV C,B (45)	MOV C,C (45)	MOV C,D (45)	MOV C,E (45)	MOV C,H (45)	MOV C,L (45)	MOV C,M (44)	MOV C,A (45)
\02_	arhl (24)	LXI D,d16 (42)	STAX D (40)	INX D (36)	INR D (38)	DCR D (38)	MVI D,d8 (14)	RAL (25)	MOV D,B (45)	MOV D,C (45)	MOV D,D (45)	MOV D,E (45)	MOV D,H (45)	MOV D,L (45)	MOV D,M (44)	MOV D,A (45)
\03_	rdel (22)	DAD D (20)	LDAX D (41)	DCX D (37)	INR E (38)	DCR E (38)	MVI E,d8 (14)	RAR (25)	MOV E,B (45)	MOV E,C (45)	MOV E,D (45)	MOV E,E (45)	MOV E,H (45)	MOV E,L (45)	MOV E,M (44)	MOV E,A (45)
\04_	RIM (3)	LXI H,d16 (42)	SHLD a16 (12)	INX H (36)	INR H (38)	DCR H (38)	MVI H,d8 (14)	DAA (6)	MOV H,B (45)	MOV H,C (45)	MOV H,D (45)	MOV H,E (45)	MOV H,H (45)	MOV H,L (45)	MOV H,M (44)	MOV H,A (45)
\05_	ldhi r8 (23)	DAD H (20)	LHLD a16 (13)	DCX H (37)	INR L (38)	DCR L (38)	MVI L,d8 (14)	CMA (6)	MOV L,B (45)	MOV L,C (45)	MOV L,D (45)	MOV L,E (45)	MOV L,H (45)	MOV L,L (45)	MOV L,M (44)	MOV L,A (45)
\06_	SIM (3)	LXI SP,d16 (42)	STA a16 (8)	INX SP (36)	INR M (39)	DCR M (39)	MVI M,d8 (16)	STC (6)	MOV M,B (43)	MOV M,C (43)	MOV M,D (43)	MOV M,E (43)	MOV M,H (43)	MOV M,L (43)	HLT (47)	MOV M,A (43)
\07_	ldsi r8 (23)	DAD SP (20)	LDA a16 (9)	DCX SP (37)	INR A (38)	DCR A (38)	MVI A,d8 (14)	CMC (6)	MOV A,B (45)	MOV A,C (45)	MOV A,D (45)	MOV A,E (45)	MOV A,H (45)	MOV A,L (45)	MOV A,M (44)	MOV A,A (45)
\20_	ADD B (1)	ADD C (1)	ADD D (1)	ADD E (1)	ADD H (1)	ADD L (1)	ADD M (4)	ADD A (1)	RNZ (19)	POP B (27)	JNZ a16 (29)	JMP a16 (30)	CNZ a16 (33)	PUSH B (26)	ADI d8 (2)	RST 0 (5)
\21_	ADC B (1)	ADC C (1)	ADC D (1)	ADC E (1)	ADC H (1)	ADC L (1)	ADC M (4)	ADC A (1)	RZ (19)	RET (18)	JZ a16 (29)	rstv (7)	CZ a16 (33)	CALL a16 (34)	ACI d8 (2)	RST 1 (5)
\22_	SUB B (1)	SUB C (1)	SUB D (1)	SUB E (1)	SUB H (1)	SUB L (1)	SUB M (4)	SUB A (1)	RNC (19)	POP D (27)	JNC a16 (29)	OUT d8 (17)	CNC a16 (33)	PUSH D (26)	SUI d8 (2)	RST 2 (5)
\23_	SBB B (1)	SBB C (1)	SBB D (1)	SBB E (1)	SBB H (1)	SBB L (1)	SBB M (4)	SBB A (1)	RC (19)	shlx (10)	JC a16 (29)	IN d8 (15)	CC a16 (33)	jnk a16 (31)	SBI d8 (2)	RST 3 (5)
\24_	ANA B (1)	ANA C (1)	ANA D (1)	ANA E (1)	ANA H (1)	ANA L (1)	ANA M (4)	ANA A (1)	RPO (19)	POP H (27)	JPO a16 (29)	XTHL (35)	CPO a16 (33)	PUSH H (26)	ANI d8 (2)	RST 4 (5)
\25_	XRA B (1)	XRA C (1)	XRA D (1)	XRA E (1)	XRA H (1)	XRA L (1)	XRA M (4)	XRA A (1)	RPE (19)	PCHL (32)	JPE a16 (29)	XCHG (46)	CPE a16 (33)	lhlx (11)	XRI d8 (2)	RST 5 (5)
\26_	ORA B (1)	ORA C (1)	ORA D (1)	ORA E (1)	ORA H (1)	ORA L (1)	ORA M (4)	ORA A (1)	RP (19)	POP PSW (27)	JP a16 (29)	DI (0)	CP a16 (33)	PUSH PSW (26)	ORI d8 (2)	RST 6 (5)
\27_	CMP B (1)	CMP C (1)	CMP D (1)	CMP E (1)	CMP H (1)	CMP L (1)	CMP M (4)	CMP A (1)	RM (19)	SPHL (28)	JM a16 (29)	EI (0)	CM a16 (33)	jk a16 (31)	CPI d8 (2)	RST 7 (5)

Colors by iWantHue

The internal decoding shown above reveals a few interesting things. The NOP instruction is literally no operation - it doesn't get decoded into any instruction group. The MOV instructions are all decoded together, except for the memory operations. Similarly, the arithmetic instructions are all grouped together, except for the memory instructions. There are other smaller groups (e.g. INR/DCR, conditional jumps, conditional calls, returns), and 21 instructions that are handled uniquely (e.g. CALL, PCHL, XCHG, HLT, and 6 undocumented instructions). Surprisingly, DAA, CMA, STC, and CMC are handled together at this stage, despite having very different actions.

↧

The 8085's register file reverse engineered

March 2, 2013, 3:35 pm


On the surface, a microprocessor's registers seem like simple storage, but not in the 8085 microprocessor. Reverse-engineering the 8085 reveals many interesting tricks that make the registers fast and compact. The picture below shows that the registers and associated control circuitry occupy a large fraction of the chip, so efficiency is important. Each bit is implemented with a surprisingly compact circuit. The instruction set is designed to make register accesses efficient. An indirection trick allows quick register exchanges. Many register operations use the unexpected but efficient data path of going through the ALU.

While the 8085's register complement is tiny compared to current processors, it has a solid register set by 1977 standards - about twice as many registers as the 6502. The 8085 has a 16-bit program counter, a 16-bit stack pointer, 16-bit BC, DE, and HL register pairs, and the 8-bit accumulator. The 8085 also has little-known hidden registers that are invisible to the programmer but used internally: the WZ register pair, and two 8-bit registers for the ALU: ACT and TMP.

Photograph of the 8085 chip showing components relevant to register operations.

The register file is in the lower left quadrant of the chip. It contains the 6 register pairs and associated circuitry. Underneath the registers is the 16-bit address latch and increment/decrement circuit. The register file is controlled by a set of control lines on the right, which are driven by register control logic circuits and the register control PLA. The current instruction is loaded into the instruction register (upper right) via the data bus. In the upper left is the 8-bit arithmetic-logic unit (ALU), with the accumulator and two temporary registers (ACT and TMP).

The 8085 has only 40 pins (visible around the edge of the image) to communicate with the outside world, a tiny number compared to current microprocessors with more than 1000 pins. For memory accesses, the 8085 reads or writes 8 bits of data using a 16-bit memory address (for a maximum of 64K of memory). In the image above, memory addresses flow through the 16-bit address bus (abus), while data flows through the chip over the 8-bit data bus (dbus). The 8 A pins handle half of the address, while the 8 AD pins are used both for the other half of the address and for data (at different times). This frees up pins for other uses, but makes computers using the 8085 slightly more complicated. In comparison, the 6502 is more straightforward, with separate pins for address and data.
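The pin sharing can be illustrated with a short Python sketch. The function names are hypothetical, and the assignment of the high byte to the A pins and the low byte to the AD pins is the standard 8085 pinout rather than something stated above.

```python
def split_address(address: int) -> tuple[int, int]:
    """Split a 16-bit address into the byte for the A pins and the byte for the AD pins."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("8085 addresses are 16 bits")
    high = (address >> 8) & 0xFF   # goes out on the 8 A pins
    low = address & 0xFF           # goes out on the 8 AD pins, before they carry data
    return high, low

def memory_read(memory: dict[int, int], address: int) -> int:
    """Model one read: the address goes out first, then the AD pins carry the data byte."""
    high, low = split_address(address)
    latched = (high << 8) | low    # an external latch must hold the address while AD changes role
    return memory.get(latched, 0)
```

The external latch in this model is the extra complication the shared pins impose on a computer built around the 8085.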

Overall architecture of the register file

The diagram below shows the implementation of the 8085 register file in the same layout as on the actual chip. The 8-bit data bus is at the top, and the 16-bit address bus is at the bottom. The register control lines are on the right.

In the middle are the registers, arranged as pairs of 8-bit registers. Note that the registers are arranged "backwards" with the high-order bit on the right and the low-order bit on the left. The 16-bit program counter and stack pointer are first. Next is the WZ temporary register, and underneath it the BC register pair. The HL and DE register pairs are at the bottom - these registers do not have fixed locations, but can swap roles during execution. A 16-bit register bus (regbus) provides access to the registers.

Underneath the registers is the address latch, which holds a 16-bit value that is written to the address bus. This value is also the input to the 16-bit increment/decrement circuit. The output of the incrementer/decrementer can be written back to the registers.

The triangles indicate tri-state buffers, basically switches that control the flow of data. Buffers containing a + are amplifiers to boost the weak signals from the registers. Buffers containing an S are superbuffers that provide extra current to send data across the long data bus.

Architecture diagram of the 8085 register file, as it is implemented on the chip. The register file is connected to the data bus at top, and address bus at bottom. The control lines are along the right.

The picture below zooms in on the chip image above, showing the register file in detail. The components in silicon exactly map onto the diagram above. Note the repeated patterns for the 16-bit circuits. The large transistors used as high-current drivers are clearly visible. The transistors in each bit of register storage are much smaller.

A closeup of the 8085 microprocessor, showing the details of the register file and the locations of the major components.

Storing bits in the register file

The implementation of the 8085 registers is unusual in several ways. The registers don't have explicit read and write modes; instead the register will be overwritten if there is a stronger signal on the bus. Instead of having a bus with one wire for each bit, the 8085 uses a sort of differential bus, with two wires for each bit: one wire transmits the value, and the other transmits the complement of the value.

Each bit consists of two inverters in a feedback loop, with pass transistors to connect the inverters to the bus. An unusual feature of this is the lack of any circuit to break the feedback loop when modifying the register (unlike the 6502). Instead, the 8085 uses a "might makes right" technique - if a stronger signal is written to the bus, it will overwrite a register connected to the bus. The transistors driving the register bus are about twice as large as the transistors in the inverters, so they can forcibly overwrite the inverter loop.

One consequence of this register implementation is that a register can't be copied directly to another register, since there's nothing to distinguish the source register from the destination register - each register could potentially damage the other's bits. To get around this, the 8085 uses an interesting trick - copies are actually done through the ALU, as will be explained later.
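The "might makes right" behavior can be captured in a toy Python model. This is only a sketch: the class name and the numeric strengths are invented for illustration, with the 2:1 ratio taken from the transistor sizes described above.

```python
from dataclasses import dataclass
from typing import Optional

CELL_STRENGTH = 1      # relative drive of the small inverter pair in a register bit
DRIVER_STRENGTH = 2    # bus drivers are "about twice as large" per the text

@dataclass
class BitCell:
    value: int

    def connect(self, bus_value: Optional[int], bus_strength: int) -> int:
        """Connect the cell to the bus; return the value that ends up on the bus."""
        if bus_value is None:
            return self.value          # nothing else is driving the bus: a read
        if bus_strength > CELL_STRENGTH:
            self.value = bus_value     # the stronger signal wins: a write
            return bus_value
        # Two equally weak drivers (e.g. two register cells) fight each other.
        raise RuntimeError("equal-strength conflict: result is indeterminate")
```

Connecting two cells to the bus at once falls into the last case, which is why a direct register-to-register copy does not work in this scheme.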

One bit of a register in the 8085 register file. Each bit is stored in two inverters in a feedback loop. The register bus uses two lines of opposite polarity for each bit. Access to the register is controlled by the reg_rw control line, which connects the inverters to the bus, allowing the value to be read or written.

The image below zooms in on the chip closer, showing the silicon for six individual register bits. The schematic for one bit is overlaid, as are some of the metal lines providing power, ground, and the register bus. Each bit consists of two transistors for the inverters, two depletion pullup transistors for the inverters (shown as resistors), and two pass transistors connecting the bit to the register bus. The pink regions are transistors, with the green strips the gates (details).

Detail of the 8085 chip showing six bits in the 8085's register file. Bit 2 of the stack pointer is shown with schematic. The two transistors form two inverters in a feedback loop. The light blue lines are the metal layer wires connected to bit 2. The program counter is in the upper half of the image.

To read a register, an amplifier circuit is used to boost the signal from the differential register bus to write it to the dbus or address latch. I assume this is a tradeoff to make the register file smaller. Each inverter pair can be made as small as possible, but then requires amplification to produce a signal strong enough for use elsewhere in the chip. The amplification circuit that drives the data bus is more complex than I'd expect, probably because of the extra power to drive the bus (details and schematic).

The incrementer/decrementer

The 16-bit incrementer/decrementer at the bottom of the register file is used for multiple purposes. It increments the program counter as instructions execute, increments and decrements the stack pointer as needed, and supports the 16-bit increment and decrement instructions.

An interesting feature of the incrementer is that it also supports incrementing by 2, which is used to quickly skip over the two-byte address in a call or jump not taken. This allows these operations to complete faster on the 8085 than the 8080.

Two bits of the 16-bit increment/decrement circuit in the 8085. Odd bits and even bits use a different circuit for efficiency. The carry out from even bits is complemented.

The incrementer/decrementer is implemented by a chain of adders with ripple carry - the carry from each bit flows into the adder for the next bit. (The above schematic shows two bits, and is repeated 8 times in the full circuit.) The DREG_INC and DREG_DEC control lines select increment or decrement. One performance trick is that alternating bits are implemented with different circuits and the carry out of even bits is inverted. This avoids the inverters that would otherwise be needed to flip the carry back to its regular state. This saves space, but even more importantly it speeds up carry propagation. Because the carry has to propagate bit-by-bit through all 16 bits to generate the final result, adding an inverter to each bit would slow it down significantly. The carry out is used to compute the undocumented K flag value (details).
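The ripple-carry behavior can be sketched in Python as a bit-by-bit loop. This models the logic only, not the alternating inverted-carry circuit, and injecting the carry at bit 1 to step by 2 is an assumption of this sketch rather than a confirmed detail of the chip.

```python
def ripple_step(value: int, decrement: bool = False, by_two: bool = False) -> tuple[int, int]:
    """Add or subtract 1 (or 2) on a 16-bit value, one bit at a time.

    Returns (result, carry_out), where carry_out is the carry or borrow
    leaving the top bit.
    """
    start_bit = 1 if by_two else 0      # assumption: stepping by 2 starts the chain at bit 1
    carry = 1
    result = value & 0xFFFF
    for bit in range(start_bit, 16):
        current = (result >> bit) & 1
        result ^= carry << bit          # each stage is a half adder: sum = bit XOR carry
        if decrement:
            carry &= 1 - current        # a borrow continues only through 0 bits
        else:
            carry &= current            # a carry continues only through 1 bits
    return result, carry
```

For example, `ripple_step(0x00FF)` has to pass the carry through eight stages before it settles, which is the propagation delay the inverted-carry trick shortens.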

In comparison, the 6502 has a 16-bit incrementer (no decrement) used exclusively by the program counter. To reduce the carry propagation delay, this incrementer uses a carry-skip. That is, the carry out of the low-order byte is immediately generated and fed into the high-order byte. Thus the carries only need to propagate through 8 bits, with the two bytes working in parallel. (The carry is easily generated by ANDing together the low-order bits. If they are all 1, there will be a carry into the high-order byte.)
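A minimal Python sketch of the carry-skip idea, with a hypothetical function name: the carry into the high byte is computed directly from the low byte instead of waiting for it to ripple through.

```python
def increment_carry_skip(pc: int) -> int:
    """Increment a 16-bit value, generating the high-byte carry without rippling."""
    low = pc & 0xFF
    high = (pc >> 8) & 0xFF
    carry_into_high = 1 if low == 0xFF else 0   # equivalent to ANDing all eight low-order bits
    new_low = (low + 1) & 0xFF                  # the two bytes can now be
    new_high = (high + carry_into_high) & 0xFF  # incremented in parallel
    return (new_high << 8) | new_low
```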

The WZ Temporary registers

The WZ register pair in the 8085 is used for temporary storage, but is invisible to the programmer. Internally, the WZ register pair is implemented like the other register pairs.

The primary use of WZ is to hold operands from a two or three byte instruction until they can be used. The WZ registers are used to hold 16-bit addresses for LDA, STA, LHLD, JMP, CALL, and RST instructions. The registers hold the port for IN and OUT. The WZ register pair can also temporarily hold information read from memory. The registers hold the address popped off the stack for RET. For XTHL, the registers hold the value from the stack.

Register decoding and the instruction set

The instruction set of the 8085 is organized so an instruction can be quickly and easily decoded to determine the register to use. The underlying structure for most 8085 instructions is the bit pattern bbDDDSSS, which reads as three octal digits, where destination bits DDD and/or source bits SSS select the register usage. The move (MOV) instructions use both fields. Other instructions (e.g. INR) use just the DDD bits to select the register, while math instructions use just the three SSS bits. Instructions that operate on register pairs don't use the lowest bit of the field. This instruction pattern is visible if the instructions are arranged in an instruction table according to their octal values.

The three bits select the register as follows:

D₂D₁D₀	Register
000	B
001	C
010	D
011	E
100	H
101	L
110	M
111	A

M indicates a memory operation and is treated as a pseudo-register in the instruction set. Some instructions (e.g. INX) use the top two bits of the three to select a register pair: BC, DE, HL, or "special" (stack pointer or accumulator). Note that in the table above the low-order bit selects a register out of a register pair.
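The field extraction described above can be sketched in a few lines of Python. The function name and the dictionary layout are illustrative choices, not part of the 8085 itself.

```python
REGISTERS = ["B", "C", "D", "E", "H", "L", "M", "A"]
PAIRS = ["BC", "DE", "HL", "special"]

def decode_fields(opcode: int) -> dict:
    """Split an opcode of the form bbDDDSSS into its register-selection fields."""
    ddd = (opcode >> 3) & 0b111
    sss = opcode & 0b111
    return {
        "top": (opcode >> 6) & 0b11,   # the two bb bits
        "dst": REGISTERS[ddd],         # meaningful for MOV, INR, DCR, MVI
        "src": REGISTERS[sss],         # meaningful for MOV and the math instructions
        "pair": PAIRS[ddd >> 1],       # top two bits of DDD, for register-pair instructions
    }

print(decode_fields(0x58))  # MOV E,B: dst is E, src is B
print(decode_fields(0x23))  # INX H: pair is HL
```

Which of the fields actually matters depends on the instruction, so a real decoder would consult only the relevant entries.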

This instruction set structure allows simple logic to control the registers. A multiplexer pulls out the right group of three bits, depending on the instruction and the cycle in the instruction (link to schematic). These three bits are then used to pick the specific register control lines to activate at each step.

The registers are controlled by about 18 control lines that affect the movement of data and the operation of the incrementer/decrementer. The following table summarizes the control lines.

Control line	Function
`/RREG_RD`	Reads the right-hand side register bus onto the data bus. This implements the multiplexing of 16-bit registers onto the 8-bit data bus.
`/LREG_RD`	Reads the left-hand side register bus onto the data bus.
`LREG_WR`	Writes the data bus to the left-hand side register bus. This implements the demultiplexing of the 8-bit data bus to the 16-bit registers.
`RREG_WR`	Writes the data bus to the right-hand side register bus.
`REG_PC_RW`	Connects the PC to the register bus.
`REG_SP_RW`	Connects the SP to the register bus.
`REG_WZ_RW`	Connects the WZ register pair to the register bus.
`REG_BC_RW`	Connects the BC register pair to the register bus.
`REG_HL_RW`	Connects the HL (DE) register pair to the register bus.
`REG_DE_RW`	Connects the DE (HL) register pair to the register bus.
`DREG_WR`	Writes the output of the incrementer/decrementer to the register bus.
`DREG_RD`	Reads the register bus into the address latch.
`/DREG_RD`	Inverted DREG_RD.
`DREG_DEC`	Incrementer/decrementer performs decrement.
`DREG_INC`	Incrementer/decrementer performs increment.
`CARRY_OUT`	The carry/borrow out from the incrementer/decrementer.
`DREG_CNT`	Increment/decrement by 1.
`DREG_CNT2`	Increment/decrement by 2.

The first step in register control is the register control PLA, which generates 19 control signals based on the instruction type and the cycle step. The register control logic (between the register file and the PLA) mixes in the register selection bits as appropriate (and a few other inputs) to generate the register control lines listed above.

For instance, the REG_BC_RW control line is activated if the PLA indicates a register access and the register bits are 00x. The RREG_RD control line is activated for a single-register read instruction if the register bits are xx0, and LREG_RD is activated if the bits are xx1. Both control lines are activated at the same time if the PLA indicates a register pair read.
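The rules in this example can be written out as a short Python sketch. It covers only the cases described in the paragraph above; the function names and the boolean inputs standing in for the PLA outputs are hypothetical.

```python
def bc_selected(reg_bits: int, register_access: bool) -> bool:
    """REG_BC_RW: active on a register access when the register bits are 00x."""
    return register_access and (reg_bits >> 1) == 0b00

def read_control_lines(reg_bits: int, pair_read: bool = False) -> set[str]:
    """Pick which half of the register bus is read onto the data bus."""
    if pair_read:
        return {"RREG_RD", "LREG_RD"}      # both halves for a register-pair read
    if reg_bits & 1:
        return {"LREG_RD"}                 # register bits xx1
    return {"RREG_RD"}                     # register bits xx0
```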

The DE/HL exchange trick

The XCHG instruction exchanges the contents of the HL register pair with the contents of the DE register pair in a single M-cycle. You might wonder how the registers can be exchanged so quickly. It turns out that this instruction is implemented with a trick - an extra level of indirection.

Although most 8085 architecture diagrams label one register pair as DE and another as HL, this isn't exactly true. In fact, the 8085 has two register pairs and either one can be the DE or HL pair. A status flip flop keeps track of which pair is DE and which is HL. As Pavel Zima figured out, the XCHG instruction doesn't move any data; it simply toggles the flip flop. The data remains in the same place, but the DE register is now HL and vice versa. Thus, the XCHG instruction is completed quickly. The consequence is that every use of DE or HL uses this flip flop to determine which register to access (link to schematic).
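The indirection is easy to model in Python. This is a sketch of the idea only; the class and attribute names are invented for illustration.

```python
class ExchangeableRegisters:
    """Two physical register pairs plus a flip-flop that says which one is DE."""

    def __init__(self) -> None:
        self.physical = [0, 0]   # two physical 16-bit register pairs
        self.swapped = False     # the status flip-flop

    def _index(self, name: str) -> int:
        base = {"DE": 0, "HL": 1}[name]
        return base ^ int(self.swapped)   # every access consults the flip-flop

    def read(self, name: str) -> int:
        return self.physical[self._index(name)]

    def write(self, name: str, value: int) -> None:
        self.physical[self._index(name)] = value & 0xFFFF

    def xchg(self) -> None:
        self.swapped = not self.swapped   # no data moves; only the names swap
```

After `xchg()`, reads of DE and HL return each other's old values while the `physical` list is untouched.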

Using the ALU to move registers

You wouldn't expect the ALU (arithmetic-logic unit) to take part in a register-to-register move, but it happens in the 8085. Many register operations take advantage of the ALU's temporary registers.

The ALU doesn't directly operate on the accumulator and input register. Instead, the accumulator is copied to the ACT (Accumulator Temporary) register and the other input is copied to the TMP register. This way, the result can be written to the accumulator without the race condition that would occur if the accumulator were an input and output at the same time.

For register moves, the source value is copied to the TMP register, the ACT register is set to 0, and the ALU performs an OR operation (ALU details), writing the result (i.e. the source value) to the dbus. This result can then be stored to the register file during a later cycle.
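In Python terms, the move amounts to ORing the source with zero. The sketch below uses a plain dictionary as a stand-in for the register file, and the function name is hypothetical.

```python
def mov_through_alu(registers: dict[str, int], dst: str, src: str) -> None:
    """Copy src to dst the way the text describes: through TMP, the ALU, and the dbus."""
    tmp = registers[src]      # source register -> dbus -> TMP
    act = 0                   # ACT is set to 0 for a move
    result = act | tmp        # the ALU performs OR, so the result equals the source
    registers[dst] = result   # ALU -> dbus -> destination register, in a later cycle

regs = {"B": 0x00, "E": 0x5A}
mov_through_alu(regs, "B", "E")
print(hex(regs["B"]))  # 0x5a
```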

The register file in action

The step-by-step operation of the register file is surprisingly complex. One complication is that the register file and buses must handle stepping the program counter, fetching the instruction, and performing any register moves, without interference. A second complication is that register moves go through the ALU as described above.

Stepping through an operation in detail will show the complexity of the register operations. The following shows the data flow for a MOV B,E instruction, which copies the contents of the E register into the B register.

Understanding this data flow requires a bit of background on 8085 instruction timing. An instruction cycle is broken down into one or more M (machine) cycles, where an 8-bit memory access can be done in one M cycle. Each M cycle is broken down into several T-states, where each T-state corresponds to one clock cycle. Each clock cycle has a low phase and a high phase.

The single-byte register-to-register MOV instruction takes one M cycle (M1), which is 4 T-states or 8 clock phases. Each clock phase is a separate line in the table. To make things more complicated, the activity for an instruction isn't entirely within its own instruction cycle. To improve performance, the 8085 uses simple pipelining, where the M1 opcode fetch of the next instruction overlaps with completion of the previous instruction.
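The timing arithmetic (1 M cycle, 4 T-states, 2 phases per T-state) can be spelled out with a tiny Python helper. The function name is made up, and the `T1/0` label style simply follows the notation used in the timing table for this instruction.

```python
def clock_phases(t_states: int = 4) -> list[str]:
    """List the clock phases of one M cycle as T<state>/<phase> labels."""
    return [f"T{t}/{phase}" for t in range(1, t_states + 1) for phase in (0, 1)]

print(clock_phases())
# ['T1/0', 'T1/1', 'T2/0', 'T2/1', 'T3/0', 'T3/1', 'T4/0', 'T4/1']
```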

The MOV B,E instruction (which copies the E register to the B register) is illustrated in the table below. The PC is copied to the incrementer latch at the end of the previous operation, and then is written to the address pins during the T1 cycle. The PC is updated with the incremented value at the end of the T2 cycle.

The instruction opcode is fetched in the T3 cycle, and at this point execution can start on the instruction. It's not until the T1 cycle of the next instruction that the register file swings into action. The E register is written to the dbus at the end of the T1 cycle. Then the ALU's TMP register is loaded from the dbus. The ALU's other argument, the ACT register is 0 at this point, and the ALU is configured to perform an OR operation. At the end of the (next instruction's) T3 cycle, the result of the ALU operation (i.e. the E register) is stored in the B register via the dbus. Meanwhile, the next instruction is getting fetched (grayed out).

Cycle	T/clock	PC action	Register action
	T4/0
	T4/1	PC → inc latch
M1 opcode fetch	T1/0	inc latch → address pins
	T1/1	inc latch → address pins
	T2/0
	T2/1	inc → PC
	T3/0		data pins → dbus → instruction reg
	T3/1
	T4/0
	T4/1	PC → inc latch
M1 opcode fetch	T1/0	inc latch → address pins
	T1/1	inc latch → address pins	E reg → dbus
	T2/0		dbus → TMP reg
	T2/1	inc → PC
	T3/0		data pins → dbus → instruction reg
	T3/1		ALU → dbus → B reg

Each step in the table above is activated by the appropriate register control lines. For instance, in T2/1, the PC is updated by triggering the reg_pc_rw and dreg_wr lines.

Conclusion

The 8085 has a complex register set, and it uses some interesting tricks to reduce the size of the chip and to optimize some operations. The register set is much harder to understand than I expected, but with careful examination it reveals its secrets.

Credits: The chip images are from visual6502.org. The visual6502 team did the hard work of dissolving chips in acid to remove the packaging and then taking many close-up photographs of the die inside. Pavel Zima converted these photographs into mask layer images, a transistor net, an 8085 simulator, and register file schematics (top, bottom).

See discussion at Hacker News. Thanks for visiting!

↧

Wealth distribution in the United States

March 5, 2013, 12:09 am


Today's Forbes billionaires list inspired me to visualize the wealth inequality in the United States. Using the Forbes list and other sources, I've created a graph that shows wealth distribution in the United States. It turns out that if you put Bill Gates on a linear graph of wealth, pretty much the entire US population is crammed into a one-pixel bar around 0.

This graph shows the wealth distribution in red. Note that the visible red line is one pixel wide and disappears everywhere else - this is the key point: essentially the entire US population is in that first bar. The graph is drawn with the scale of 1 pixel = $100 million in the X axis, and 1 pixel = 1 million people in the Y axis. Away from the origin, the red line is invisible - less than 1/1000 of a pixel tall since so few people have more than $100 million. It's striking just how much money Bill Gates has; even $100 million is negligible in comparison.

Since the median US household wealth is about $100,000, half the population is crammed into a microscopic red line 1/1000 of a pixel wide. (The line would be narrower than the wavelength of light, so it would be literally invisible.) And it turns out the 1-pixel-wide red line isn't just the "99%", but the 99.999%. I hypothesize this is why even many millionaires don't feel rich.
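The pixel arithmetic is simple enough to check with a few lines of Python. The scale and the median figure come from the text; the function name is just for illustration.

```python
DOLLARS_PER_PIXEL = 100_000_000   # 1 pixel = $100 million on the X axis

def pixels(wealth_dollars: float) -> float:
    """Convert a dollar amount to a horizontal distance on the graph."""
    return wealth_dollars / DOLLARS_PER_PIXEL

print(pixels(100_000))        # median household wealth: 0.001 pixel
print(pixels(1_000_000_000))  # one billion dollars: 10.0 pixels
```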

Wealth inequality among billionaires

Much has been written about inequality in the US between the rich and the poor, but it turns out there is also huge inequality among the ranks of billionaires. Looking at the 1.9 trillion dollars held by US billionaires, it turns out that the top 20% of billionaires have 59% of this wealth, while the bottom 20% of billionaires have less than 6%. So even among billionaires, most of the money is skewed to the top. (I originally pointed this out in Forbes in 1998, and the billionaire inequality has grown slightly since then.)
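For readers who want to run this kind of calculation themselves, here is a hedged Python sketch of a quintile-share computation. The sample fortunes are made-up round numbers chosen only to exercise the function; they are not the Forbes data.

```python
def quintile_shares(fortunes: list[float]) -> tuple[float, float]:
    """Return the fraction of total wealth held by the top 20% and by the bottom 20%."""
    if len(fortunes) < 5:
        raise ValueError("need at least five entries to form quintiles")
    ordered = sorted(fortunes, reverse=True)
    n = len(ordered) // 5                 # size of one quintile
    total = sum(ordered)
    return sum(ordered[:n]) / total, sum(ordered[-n:]) / total

# Hypothetical fortunes in billions of dollars (illustration only).
sample = [50, 20, 10, 5, 5, 4, 2, 2, 1, 1]
top, bottom = quintile_shares(sample)
print(f"top 20%: {top:.0%}, bottom 20%: {bottom:.0%}")
```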

Sources

The billionaire data is from the Forbes billionaires list 2013. Median wealth is from Wikipedia. Additional sources: "Measuring the Top 1% by Wealth, Not Income" and "More millionaires despite tough times". Wealth data has a lot of sources of error including people vs households, what gets counted, and changing time periods, but I've tried to make this graph as accurate as possible. I should also mention that wealth and income are two very different things; this post looks strictly at wealth.

↧

Tenma 72-7740 multimeter: review and teardown

April 15, 2013, 11:19 pm


The Tenma 72-7740 digital multimeter is a multimeter in the $70 price range. Overall, it's a nice, solidly built meter and it has performed well for me. I received this DMM from Newark element14 for review; in this article I describe its functionality followed by a teardown.

What you get

What comes in the box with the Tenma 72-7740 DMM: temperature probe, battery, alligator clips, and probes.

The DMM comes with a temperature probe, battery, alligator clips, and test probes. Note that the test probes have very short metal tips, unlike the long tips on most probes. The alligator clip probes are a nice addition. My biggest complaint with the DMM is that the temperature probe's connections are soldered with no strain relief, so I worry the wires will break off.

The DMM also comes with a pocket-sized 36 page operating manual - a real, physical manual on paper, not a PDF file like most products these days. The DMM doesn't really need a manual - functions work pretty much as you'd expect - but it's nice to have the manual.

Specifications

The DMM is autoranging with a maximum reading of 3999. It is full-size (177mm × 85mm × 40mm), not a pocket DMM, and has a built-in stand. The LCD display is large and clear and has a backlight, which is nice if I ever end up using the meter in the dark. It has 10MΩ input impedance and maximum voltages of 1000V DC and 750V AC. The top current range is 10A.

The DMM also includes capacitance, diode, temperature, frequency, and duty cycle measurements. The top capacitance range (100µF) can take up to 15 seconds to get a measurement, so be patient with those big electrolytics. The lowest capacitance range is 40 nF with 10pF resolution claimed. The DMM also has a continuity buzzer, although I find the sound crackles a bit. The temperature readings are only in °C; I know using Fahrenheit makes me a bad person, but that's what I need to check my appliances. The temperature range is -40°C to 1000°C. The upper range is hotter than I need, but since I sometimes go outside below -40°C my multimeter should be able to handle it too.

Buttons provide hold and relative mode. The meter goes into sleep mode after 30 minutes.

I don't have the equipment to measure the accuracy of the DMM myself, so I'm going off the published values. The specification for DC voltage accuracy is a reasonable ±0.8%; the considerably more expensive Fluke 177 has ±0.09% accuracy, so you get what you pay for.
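To put the two percentages in perspective, the following Python sketch computes the error band around a reading. It uses only the percentage figures quoted above; published meter specifications typically also add a fixed number of counts, which this sketch ignores.

```python
def error_band(reading: float, accuracy_percent: float) -> tuple[float, float]:
    """Return the (low, high) range implied by a ±percent-of-reading accuracy spec."""
    delta = reading * accuracy_percent / 100
    return reading - delta, reading + delta

for name, accuracy in [("Tenma 72-7740", 0.8), ("Fluke 177", 0.09)]:
    low, high = error_band(10.0, accuracy)
    print(f"{name}: a 10 V reading means {low:.3f} V to {high:.3f} V")
```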

The function knob has 7 positions: V, resistance/capacitance/diode/continuity, Hertz, °C, µA, mA, and A. The blue function button switches between AC and DC or switches among resistance, capacitance, diode, and continuity.

There are a few functions found in more advanced multimeters that aren't found here: min/max measurement, RS-232 support, °F, a 4nF capacitance scale, and an analog bar graph.

For full specifications, see the specification chart.

Tenma 72-7740 digital multimeter measuring 60Hz line frequency

Teardown

Of course I was interested in what was inside the multimeter and opened it up. The instruction manual describes how to remove the screws under the feet. The force required to pry the case apart made me a little nervous, but it snapped open without breaking anything. Note that the case must also be opened in this way to replace the fuses - they are not accessible from the battery compartment. Unfortunately I tend to blow fuses a lot measuring charger performance, but this may motivate me to be more careful.

A foil shield covers most of the circuit board, with holes for some adjustments. Near the bottom is a thick wavy wire, which is the precision resistor used for the high-current measurements, a fraction of an ohm. There's nothing particularly interesting directly under the foil shield; almost all the components are on the other side.

Removing the circuit board and flipping it over shows the circuitry. The large LCD display is at the top, with the pushbuttons below. The most visually striking part of the board is the round circuitry for the function knob, which I will explain in more detail below. To the left are three precision (blue) resistors for mA and µA measurement. Below are 5 diodes which I believe are for input protection. The large black cylinder in the lower right appears to be a spark gap to protect the input from high voltages - the DMM is rated to 1000V DC overload protection. Below it is a large yellow PTC resistor to protect from input overloads. The 8-pin IC is a STMicroelectronics TL062C low power J-FET dual op amp.

The circuit board for the Tenma 72-7740 DMM.

Underneath the LCD is the 100-pin controller IC and a bunch of SMD components. The Semico CS7721CN chip powers the Tenma 72-7740 DMM. I wasn't expecting that a DMM chip would need 100 pins, but that seems to be common. I couldn't find a datasheet for this specific chip, but datasheets for other similar chips (such as the Fortune FS9721 and Cyrustek ES51982) give an idea of how digital multimeters work. The chip has signal inputs for the different functions (voltage, current, resistance, frequency, capacitance, etc.). The four blue precision resistors below the chip divide the input by powers of 10 as appropriate. Six mode pins are connected to the function selector switch to select the appropriate function, as will be described below. The function pushbuttons are also connected to the IC. About 17 pins from the IC drive the LCD segments. The crystal provides accurate timing, which is critical for the accuracy of the dual-slope analog-to-digital converter that measures the input.
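To show why timing matters for a dual-slope converter, here is a generic Python sketch of the technique. It is a textbook model, not the CS7721CN's actual implementation; the function name and the 4000-count run-up period are assumptions for illustration, and it handles non-negative inputs only.

```python
def dual_slope_convert(v_in: float, v_ref: float, run_up_counts: int = 4000) -> int:
    """Return the run-down count for a generic dual-slope conversion."""
    # Phase 1: integrate the unknown input for a fixed number of clock ticks.
    integrator = 0.0
    for _ in range(run_up_counts):
        integrator += v_in
    # Phase 2: integrate the reference with opposite sign, counting ticks until zero.
    count = 0
    while integrator > 0:
        integrator -= v_ref
        count += 1
    return count   # approximately run_up_counts * v_in / v_ref

print(dual_slope_convert(0.25, 1.0))  # 1000
```

Both phases are measured in clock ticks, so the result depends on the ratio of two time intervals set by the same clock, which is where the crystal comes in.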

The 100-pin Semico CS7721CN chip powers the Tenma 72-7740 DMM.

How the function selector switch works

Rotary switches have always been mysterious to me. The pattern on the circuit board seems to be made up of random lines rather than any obvious switch contacts, and looks as much like an Aztec symbol as a switch. So I figured it was time to dive in and figure out how it works.

The selector knob has 7 positions, rotating a bit under 180 degrees in total. Looking at the back of the selector knob, you can see six independent sliders for six separate switching circuits. Each slider has two peaks in the middle, which bridge two contacts on the circuit board. Note that the two outermost sliders are offset 90 degrees from the others. Since the knob turns a bit under 180°, the sliders n

The following diagram shows how the switches work. Each of the six colored semi-circular rings is associated with one of the sliders. The seven lines inside each semicircle indicate the seven possible positions of the associated slider. The most counterclockwise position in each ring is the V setting, followed by Ω, Hz, °C, µA, mA, and A. If there are two traces lined up with the slider, the slider will connect the two traces.

One surprise is that many of the traces don't actually form a circuit. The highlighted positions in the diagram are active positions that close two contacts, but the other positions don't form a connection. In particular, the red ring is only active in one position, and the blue ring in two positions. Many positions have the same circuit trace on both ends of the line, which means the switch does nothing and the trace is unnecessary. My guess is that the redundant metal is there because metal-on-metal is lower friction than metal-on-circuit-board.

The white, cyan, and blue rings ground various combinations of "mode" pins on the IC to select the function. The left half of the purple ring directs the µAmA°C input to the appropriate circuit based on the function. The right half of the purple ring directs the HzVΩ input appropriately. The red ring has a connection only for the °C setting. The orange ring makes connections for Ω, °C, and A.

Conclusion

The 72-7740 DMM is a solid meter that gets the job done, and I have only minor complaints. It currently sells for about $70. Inexplicably, the next model up, the 7745, is cheaper despite having true RMS and a serial RS-232 output. The next model down, the 7735, is a good deal at about half the price; it also has RS-232, although it lacks temperature measurement, the backlight, and sleep mode.

The Tenma 72-7740 DMM in the box.

Thanks to Newark element14 for giving me this digital multimeter free for review. (Newark element14 consists of the merger of the well-known Newark electronics distributor and the element14 online electronics community into a single global brand.)

↧

Teardown and exploration of Apple's Magsafe connector

June 2, 2013, 9:13 am

Have you ever wondered what's inside a Mac's Magsafe connector? What controls the light? How does the Mac know what kind of charger it is? This article looks inside the Magsafe connector and answers those questions.

The Magsafe connector (introduced by Apple in 2006) is very convenient. It snaps on magnetically and disconnects if you pull on it. In addition, it is symmetrical, so you don't need to worry about which side is up. A small LED on the connector changes color to indicate the charging status.

The picture below shows the newer Magsafe 2 connector, which is slimmer. Note how the pins are arranged symmetrically; this allows the connector to be plugged in with either side on top. The charger and computer communicate through the adapter sense pin (also called the charge control pin), which this article will explain in detail below.

The pins of a Magsafe 2 connector. The pins are arranged symmetrically, so the connector can be plugged in either way.

Magsafe connector teardown

I had a Magsafe cable that malfunctioned, burning the power pins as you can see in the photo below, so I figured I'd tear it down and see what's inside. The connector below is an older Magsafe; notice the slightly different shape compared to the Magsafe 2 above. Also note that the middle adapter sense pin is much smaller than the other pins, unlike the Magsafe 2.

Removing the outer plastic shell reveals a block of soft waxy plastic, maybe polyethylene, that helps diffuse the light from the LEDs and protects the circuit underneath.

Cutting through the soft plastic block reveals a circuit board, protected by a thin clear plastic coating. The charger wires are soldered onto the back of this board. Only two wires - power and ground - go to the charger unit. There is no data communication via the adapter sense pin with the charger unit itself.

Disassembling the connector shows the spring-loaded "Pogo pins" that form the physical connection to the Mac. The plastic pieces hold the pins in place. The block of metal on the left is not magnetized, but is attracted by the strong magnet in the Mac's connector.

The circuit board inside the Magsafe connector is very small, as you can see below. In the middle are two LEDs, orange/red and green. Two identical LEDs are on the other side. The tiny chip on the left is a DS2413 1-Wire Dual Channel Addressable Switch. This chip has two functions. It switches the status LEDs on and off (that's the "dual channel switch" part). It also provides the ID value to the Mac indicating the charger specifications and serial number.

The chip uses the 1-Wire protocol, which is a clever system for connecting low-speed devices through a single wire (plus ground). The 1-Wire system is convenient here since the Mac can communicate with the Magsafe through the single adapter sense pin.

Understanding the charger's ID code

You can easily pull up the charger information on a Mac (Go to "About this Mac", "More Info...", "System Report...", "Power"), but much of the information is puzzling. The wattage and serial number make sense, but what about the ID, Revision, and Family? It turns out that these are part of the 1-Wire protocol used by the chip inside the connector.

Every chip in the 1-Wire family has a unique 64-bit ID that is individually laser-programmed into the chip. In the 1-Wire standard, the 64-bit ID consists of an 8-bit family code identifying the type of 1-Wire device, a 48-bit unique serial number, and an 8-bit non-cryptographic CRC checksum that verifies the ID number is correct. Companies (such as Apple) can customize the ID numbers: the top 12 bits of the serial number are used as a customer ID, the next 12 bits are data specified by the customer, and the remaining 24 bits are the serial number.

With this information, the Mac's AC charger information now makes sense and the diagram below shows how the 64-bit ID maps onto the charger information. The ID field 100 is the customer ID indicating Apple. The wattage and revision are in the 12 bits of customer data (hex 3C is 60 decimal, indicating 60 watts). The Family code BA is the 1-Wire family code for the DS2413 chip. Thus, much of the AC charger information presented by the Mac is actually low-level information about the 1-Wire chip.

The 1-Wire chip inside a Magsafe connector has a 64-bit ID code. This ID maps directly onto the charger properties displayed under 'About this Mac'.
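
The field layout described above can be sketched as a small Python decoder. This is only an illustration: the function names are made up for this example, the ID is synthetic (built in the code, not read from a real charger), and splitting the 12 bits of customer data into an 8-bit wattage followed by a 4-bit revision is an assumption based on the field order described in this article. The checksum is the standard 1-Wire CRC-8 (polynomial x^8 + x^5 + x^4 + 1), computed over the seven low-order bytes starting with the family code.

```python
def crc8_1wire(data: bytes) -> int:
    """Standard 1-Wire CRC-8 (x^8 + x^5 + x^4 + 1), processed LSB first."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8C if crc & 1 else crc >> 1
    return crc


def decode_id(rom: int) -> dict:
    """Split a 64-bit 1-Wire ID into the fields described in the text."""
    serial48 = (rom >> 8) & 0xFFFFFFFFFFFF       # 48-bit serial field
    customer_data = (serial48 >> 24) & 0xFFF     # middle 12 bits
    crc = (rom >> 56) & 0xFF                     # top byte
    low7 = (rom & 0x00FFFFFFFFFFFFFF).to_bytes(7, "little")  # family code first
    return {
        "family": rom & 0xFF,                    # low byte
        "customer_id": (serial48 >> 36) & 0xFFF, # top 12 bits of the serial field
        "wattage": customer_data >> 4,           # assumed: upper 8 bits
        "revision": customer_data & 0xF,         # assumed: lower 4 bits
        "serial": serial48 & 0xFFFFFF,           # remaining 24 bits
        "crc": crc,
        "crc_ok": crc8_1wire(low7) == crc,
    }


def make_id(family, customer_id, wattage, revision, serial) -> int:
    """Build a synthetic ID with a valid CRC (for demonstration only)."""
    body = (customer_id << 44) | (wattage << 36) | (revision << 32) | (serial << 8) | family
    return (crc8_1wire(body.to_bytes(7, "little")) << 56) | body


# Synthetic example: family BA, customer ID 100, 60 watts, made-up serial number.
example = make_id(0xBA, 0x100, 0x3C, 0x0, 0x123456)
print("%016X" % example)
print(decode_id(example))
```

A handy property of this CRC is that running it over all eight bytes, including the CRC byte itself, gives zero for a valid ID.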

There are a few complications as the diagram below shows. Later chargers use the family code 85 for some reason. This doesn't indicate an 85 watt charger. It also doesn't indicate the family of the 1-Wire device, so it may be an arbitrary number. For Magsafe 2 chargers, the customer ID is 7A1 for a 45 watt charger, 921 for a 60 watt charger, and AA1 for an 85 watt charger. It's strange to use separate customer IDs for the different models. Even stranger, for an 85 watt charger the wattage field in the ID contains 60 (3C hex) not 85, even though 85 watts shows up on the info screen. The Revision is also dropped from the info screen for later chargers.

In a Magsafe 2 connector, the 64-bit ID maps onto the charger properties displayed under 'About this Mac'. For some reason, the 'Customer data' gives a lower wattage.
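
One way to picture the Magsafe 2 quirk is as a lookup table keyed by customer ID. The Python sketch below uses only the three customer IDs listed above; the function name and the fallback to the wattage field for other IDs are assumptions for illustration. I don't know how the Mac actually derives the wattage it displays.

```python
# Customer IDs observed on Magsafe 2 chargers, as listed in the text.
MAGSAFE2_WATTS = {0x7A1: 45, 0x921: 60, 0xAA1: 85}


def displayed_wattage(customer_id: int, wattage_field: int) -> int:
    """Guess the wattage: use the Magsafe 2 table if the customer ID is in it,
    otherwise fall back to the wattage field (hypothetical logic)."""
    return MAGSAFE2_WATTS.get(customer_id, wattage_field)


# An 85 watt Magsafe 2 charger whose wattage field reads 3C hex (60).
print(displayed_wattage(0xAA1, 0x3C))
# An original Magsafe charger with customer ID 100 and wattage field 3C hex.
print(displayed_wattage(0x100, 0x3C))
```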

How to read the ID number

It's very easy to read the ID number from a Magsafe connector using an Arduino board and a single 2K pullup resistor, along with Paul Stoffregen's Arduino 1-Wire library and a simple Arduino program.

The circuit to access a 1-Wire chip from an Arduino is trivial - just a 2K pullup resistor.

Touching the ground wire to an outer ground pin of the Magsafe connector and the data wire to the inner adapter sense pin will let the Arduino immediately read and display the 64-bit ID number. The charger does not need to be plugged in to the wall - and in fact I recommend not plugging it in - since one interesting feature of the 1-Wire protocol is the device can power itself parasitically off the data wire, without a separate power source.

The 64-bit ID can be read out of a Magsafe connector by probing the outer pin with ground, and the middle pin with the 1-Wire data line.

To make things more convenient, the serial number can be displayed on an LCD display. The circuit looks complicated, but it's just a tangle of wires connecting the LCD display. Using a simple program, the 64-bit ID number is displayed on the bottom line of the display. The top line is a legend indicating the components of the code: "cc" CRC check, "id." customer id, "ww" wattage, "r" revision, "serial" serial number, and "ff" family. The number below corresponds to an 85 watt charger (55 hex = 85 decimal).

A 1-Wire ID reader with LCD display. Touching the wires to the contacts of the Magsafe connector displays the ID code on the bottom line of the display. The top line indicates the components of the code: CRC check, customer id, wattage, revision, serial number, and family.

Controlling the Magsafe status light

The Mac controls the status light in the Magsafe connector by sending commands through the adapter sense pin to the 1-Wire DS2413 switch IC to turn the two pairs of LEDs on or off. By sending the appropriate commands to the IC through the adapter sense pin, an Arduino can control the LEDs as desired.

The picture below demonstrates the setup. The same simple resistor circuit as before is used to communicate with the chip, along with a simple Arduino program that sends commands via the 1-Wire protocol. These commands are described in the DS2413 datasheet but should be obvious from the program code.

I used a cable removed from a dead charger for simplicity. The LEDs are normally powered by the charger's voltage, which I simulated with two 9-volt batteries. To hook the Arduino to the connector, this time I used a Mac DC input board that I got on eBay; this is the board in a Mac that the Magsafe connector plugs into. The only purpose of the board here is to give me a safer way to attach the wires than poking at the pins.

The connector contains a pair of orange/red LEDs and a pair of green LEDs, which can be switched on and off independently. When both pairs are lit, the resulting color is yellow. Thus, the connector can display three colors. The Arduino program cycles through the three colors and off, as you can see from the pictures above.
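
For reference, here is roughly what such a command looks like as bytes. This Python sketch only builds the byte sequence; it doesn't talk to any hardware and it isn't the Arduino program mentioned above. The command codes (Skip ROM CC hex, PIO Access Write 5A hex, then the output byte with its six unused high bits set to 1, followed by the complement of that byte) are as I understand them from the DS2413 datasheet, so check them there before relying on them. Which output drives which LED pair, and the idea that a 0 bit turns an LED on (the outputs are open-drain), are assumptions for illustration.

```python
SKIP_ROM = 0xCC          # standard 1-Wire command: address the only device on the bus
PIO_ACCESS_WRITE = 0x5A  # DS2413 command to set the two outputs


def pio_write_frame(led_a_on: bool, led_b_on: bool) -> bytes:
    """Build the bytes to send after a 1-Wire reset to set both outputs.

    Assumption: a 0 bit switches the open-drain output on, lighting the LED.
    """
    state = 0xFC                 # upper six bits must be 1
    if not led_a_on:
        state |= 0x01            # PIOA off
    if not led_b_on:
        state |= 0x02            # PIOB off
    return bytes([SKIP_ROM, PIO_ACCESS_WRITE, state, state ^ 0xFF])


# Hypothetical color assignment: A = green pair, B = orange/red pair.
for name, (a, b) in {"off": (False, False), "green": (True, False),
                     "orange": (False, True), "yellow": (True, True)}.items():
    print(name, pio_write_frame(a, b).hex())
```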

The charger startup process

When the Magsafe connector is plugged into a Mac, a lot more happens than you might expect. I believe the following steps take place:

The charger provides a very low current (about 100 µA) 6 volt signal on the power pins (3 volts for Magsafe 2).
When the Magsafe connector is plugged into the Mac, the Mac applies a resistive load (e.g. 39.41KΩ), pulling the power input low to about 1.7 volts.
The charger detects the power input has been pulled low, but not too low. (A short or a significant load will not enable the charger.) After exactly one second, the charger switches to full voltage (14.85 to 20 volts depending on model and wattage). There's a 16-bit microprocessor inside the charger to control this and other charger functions.
The Mac detects the full voltage on the power input and reads the charger ID using the 1-Wire protocol.
If the Mac is happy with the charger ID, it switches the power input to the internal power conversion circuit and starts using the input power. The Mac switches on the appropriate LED on the connector using the 1-Wire protocol.

This process explains why there is a delay of a second after you connect the charger before the light turns on and the computer indicates the battery is charging. It also explains why if you measure the charger output with a voltmeter, you don't find much voltage.
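
The charger's side of this handshake can be modeled as a simple timer on a voltage window. The Python sketch below is a toy model, not the charger's firmware: the one-second delay comes from the steps above, but the window limits of 1.0 and 3.0 volts are hypothetical placeholders for "pulled low, but not too low", and the function name is made up.

```python
SENSE_MIN_V = 1.0        # hypothetical lower limit ("not too low")
SENSE_MAX_V = 3.0        # hypothetical upper limit ("pulled low")
ENABLE_DELAY_MS = 1000   # the one-second delay described in the text


def enable_time_ms(samples):
    """Return the time (ms) at which full voltage would switch on, or None.

    samples is a sequence of (time_ms, volts) readings of the power pins
    while the charger is in its low-current mode.
    """
    in_window_since = None
    for t_ms, volts in samples:
        if SENSE_MIN_V <= volts <= SENSE_MAX_V:
            if in_window_since is None:
                in_window_since = t_ms          # load just detected
            if t_ms - in_window_since >= ENABLE_DELAY_MS:
                return t_ms                     # load held for a full second
        else:
            in_window_since = None              # open circuit or short: start over
    return None


# Unplugged (6 V) for half a second, then the Mac's load pulls the pins to 1.7 V.
plugged_in = [(t, 6.0 if t < 500 else 1.7) for t in range(0, 3000, 100)]
shorted = [(t, 0.0) for t in range(0, 3000, 100)]
print(enable_time_ms(plugged_in))
print(enable_time_ms(shorted))
```

In this model a voltmeter (which barely loads the pins) and a dead short both leave the charger in its low-current mode, matching the behavior described above.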

The complex sequence of steps provides more safety than a typical charger. Because the charger is providing extremely low current at first, there is less risk of shorting something out while attaching the connector. Since the charger waits a full second before powering up, the Magsafe connector is likely to be firmly attached by the time full power is applied. The safety features are not foolproof, though, as the burnt-up connector I tore apart shows.

Don't try this at home

Warning: I recommend you don't try any of these experiments. 85 watts is enough to do lots of damage: blow out your Mac's DC input board, send flames out of a component, blow fuses, or vaporize PC traces, and that's just the things I've had happen to me. The Mac and charger both have various protection mechanisms, but they won't take care of everything. Poking at your charger while it's plugged in is a high-risk activity.

Reading your charger's ID by probing the pins while it's not plugged in is considerably safer, but I can't guarantee it. If you mess up your charger, computer or Arduino you're on your own.

Conclusions

There's more to the Magsafe charger connector than you might expect. The center pin of the connector - the adapter sense pin - controls a tiny chip that both identifies the charger and controls the status LED. It is part of a complex interaction between the charger and the Mac. Using an Arduino microcontroller, this chip can be accessed and controlled using the 1-Wire protocol. Is this useful? Not really, but hopefully you found it interesting.

↧

The Mili universal car/wall USB charger, tested in the lab

June 20, 2013, 10:54 pm

I received a Mili universal USB charger for review from Mobile Fun. This interesting charger has some features that make it my current favorite travel charger. It runs off both wall power and car accessory power. It comes with swappable plugs for Europe, UK, US, or Australia, and runs on 120 or 240 volts. It has two USB outputs - I thought this was pointless until I discovered how useful it is in car trips if two people can charge at the same time. In addition, one of the ports provides 10 watts for charging tablets (when plugged into AC). The charger also lights up - red indicates charging, and green indicates the devices are charged.

The charger has a few disadvantages. It is a bit expensive with a list price of $49. Measuring about 2 3/4 inches by 2 1/4 inches, it's much larger than Apple's super-compact inch-cube charger - although it has much more functionality. Finally, due to the design, it ends up blocking both outlets when you plug it into the wall.

In the remainder of this article, I test the performance of the charger both in the car and with AC power. To summarize, the power quality is excellent in the car, but has more noise than the average charger when plugged into the wall.

The Mili charger with adapters for different countries.

The label shows that when connected to AC, the charger is rated as 2.1A for output 1 and 1A for output 2; that is, it is designed to power an iPad from output 1 and a phone from output 2. When plugged in to a car accessory outlet, it is only rated to provide 1 amp, so charging a tablet will be slower. In the measurements below, I find that the charger's power exceeds these ratings when plugged into the wall, which is good, but provides a bit less than the expected one amp when plugged into a car output, which may make charging slower.

Label from the Mili charger.

Apple devices can reject "wrong" chargers with the error "Charging is not supported with this accessory"; Apple uses special proprietary voltages on the USB data pins to distinguish different types of chargers (details). I measured these voltages on the Mili charger and verified that it is configured to appear as an Apple 2A charger on output 1, and an Apple 1A charger on output 2.

Cars: a hostile electrical environment

You might expect to find 12 volts at your car's accessory outlet, but what comes out can be surprisingly noisy and variable. This voltage will have spikes from the ignition system as well as very large transients due to starting, malfunctions, or jump starting. A car charger must handle this hostile voltage input, and make sure the output to your device is smooth.

Test setup to measure charger performance in a car.

I measured the voltage in my car to see what happens in a real-world environment using the setup illustrated above. The Mili charger is plugged in just to the left of the gear shift. Above it is the USB interface board, which is connected to the oscilloscope on the dash.

Car voltage drops and rises when the car is started (left). Car voltage at idle showing ignition spikes (right).

The oscilloscope trace (yellow) on the left shows the large voltage fluctuations when I started the car. At the very left, with the ignition off, the battery provides about 12.5 volts. The starter pulls the voltage down to 8.88 volts until the engine starts. The voltage gradually rises over 6 seconds, settling around 14 volts.

On the right, zooming in shows that while the car is idling, the accessory output has 1/2 volt spikes every 28 milliseconds, due to the ignition firing. Note that the voltage is much noisier with the car running than on battery: the line on the left is thin, and the line on the right is thick.

Performance of the Mili charger in a car

The Mili charger has a plug that folds out from the side for use in a car. While this makes the charger larger than a dedicated wall charger, having a charger that works both in the car and with AC is more convenient than I expected, especially when traveling.

The Mili USB charger with car adapter.

I looked at the output of the Mili charger while starting the car, to see if the large voltage fluctuations shown above affected the charger's output. The Mili output remained steady, which is good. I also didn't see any of the ignition spikes in the output from the Mili charger. This indicates that the Mili charger does a good job of filtering out noise from the automotive environment.

I tested the Mili charger with inputs from 0 to 30 volts. 30 volts may seem excessive, but jump-starts often use 24 volts, and car electrical failures can result in a 120 volt "load dump". Fortunately, the Mili survived 30 volts just fine (unlike some other chargers I'm testing). The image below shows that the Mili generates a stable output voltage (horizontal line) for inputs from 7 volts to 30 volts. This is a good thing, showing that the Mili won't overload your phone even if your car is providing too much voltage. As expected, the Mili can't produce the full output voltage if the input voltage is too low (left side of the graph).

Output voltage (Y axis) of the Mili charger as the input ranges from 0 to 30 volts (X axis).

The oscilloscope displays below show the output and frequency spectrum with 12V DC input and a 5W load. The power quality is very good - the yellow line is thin and has very few spikes. The high frequency spectrum (orange) shows a spike at the switching frequency, but overall the power quality is among the best of chargers I've looked at.

High frequency spectrum (left) and Low frequency spectrum (right) of the Mili charger on 12V input.

Next, I measured the voltage the charger can provide under increasing load (details). The horizontal line shows the voltage drops from about 5 volts to 4.5 volts as the load increases. The vertical line shows the charger maxes out around 0.9 amps with less than the expected 5 volts. This is slightly less than the rated 1 amp the charger is supposed to provide. Both USB outputs provide the same current when plugged into a car outlet.

Voltage vs Current for the Mili charger with 12V input.

Charger performance with wall input

I also examined the performance of the Mili charger when plugged into the wall (120V AC). One minor annoyance with using the Mili as a wall charger is that due to the position of the USB ports, both wall outlets are blocked either by the charger or USB cables.

The Mili charger.

The images below show the voltage the charger can provide under increasing load (details). When plugged into the wall, the two USB outputs provide different maximum currents, unlike when plugged into a car outlet. Output 1 (the high current output) is on the left, and output 2 (the low current output) is on the right. Output 1 reaches about 2.45A before the voltage starts dropping, well above the 2.1A rating. The line for output 1 gets fairly wide above 1A, showing the voltage is not too stable. The line also slopes downwards to the right, indicating the voltage drops somewhat as the load increases. Output 2 reaches about 1.1A before the light starts flashing and the power drops and climbs (the curved lines). This graph shows strange behavior under overload that I haven't seen in other chargers. The lines are all fairly wide, showing the voltage is

Voltage vs current for the Mili charger (output 1 left, output 2 right) with 120V AC input.

I looked at the voltage output along with the high frequency and low frequency spectrums (below), to examine the quality of the power outputs. The yellow line is much wider than when plugged into the car outlet, showing a lot more noise in the output. The large orange spike in the middle of the high frequency spectrum shows that a lot of the charger's switching noise is appearing on the output. Compared to other chargers, the power quality is lower than average. On the positive side, the flat low-frequency spectrum shows the charger is very good at eliminating ripple due to the 60 Hz power lines.

High frequency (left) and low frequency (right) spectrum of the Mili charger with 120V AC input.

Conclusions

The Mili charger is convenient for travel because it has plugs for multiple countries, works as an auto charger, and has dual outputs. The power quality is very good in the car, but not so good with AC power. This charger is my favorite charger now - while I'd like to tear it apart and examine the circuit inside, I like it too much to destroy it. Hopefully if you get one you'll like it too. And if you found this interesting, check out my detailed analysis of a dozen chargers in the lab.

Thanks to Mili, Mobile Fun, and Mihnea for providing me with the charger and patiently waiting for the review.

↧

Twelve tips for using the Rigol DS1052E Oscilloscope

July 5, 2013, 10:30 am

In this article I share a few tips I've learned about using the Rigol DS1052E oscilloscope.

The Rigol DS1052E digital oscilloscope.

Push the knobs

The knobs all have convenient actions if you push them: pushing Vertical Position or Horizontal Position centers the trace vertically or horizontally. Pushing Trigger Level sets it to zero. Pushing Scale sets it to fine adjust mode.

Long Memory

If you don't use Long Memory, you're wasting most of the capacity of the oscilloscope. Long Memory stores 64 times as much data, so you can really zoom in on the waveform. To enable Long Memory, push the Acquire menu button, then select MemDepth to set Long Mem. There's additional documentation here.

The Long Memory depth option of the Rigol DS1052E oscilloscope.

Use zoom

Once you've recorded a waveform, you can pan across it using the horizontal position knob - the waveform window indicator at the very top of the screen shows where you are. In mid-range settings, however, the pan range is fairly limited (about a factor of 5) compared to how deep you can zoom with the horizontal scale knob (about a factor of 1000 with Long Memory). Note: zoom works best with Single triggering; if you use Auto or Norm triggering and hit Run/Stop, sometimes the detailed data isn't in memory and zoom doesn't show more than is on the display.

Pushing the Scale knob turns on the cool zoom mode, which lets you see the trace and a zoomed-in version at the same time, letting you zoom and pan.

The zoom feature of the Rigol DS1052E oscilloscope.

Using the menus

Most of the menu buttons are in the group of 6 at the top. However, there's also a trigger menu button under the trigger knob and a time base menu under the horizontal position knob. This is in addition to the four vertical menu buttons: CH1, CH2, MATH, and REF.

The menus hide about 1/6 of the display, so close the menu when you're done: push the round Menu On/Off button or push a menu button a second time.

Don't press Auto

The Auto button is right next to the Run/Stop button, so you might think it will set the trigger to Auto Sweep. Instead this button sets the controls to seemingly-random values to automatically display your traces. This is good if you're totally lost, but is more likely to wipe out the settings you want.

Screenshots

Some oscilloscopes make screenshots easy, but the Rigol is more complicated. To take a screenshot on the Rigol, plug a USB drive into the front panel, then hit the Storage menu button, select Bit map under Storage, select External, New File, and Save. This will save NewFile0.bmp to your flash drive. (It's much easier to rename the file on your computer than on the oscilloscope.)

An alternative is to run the slightly clunky UltraScope software on your computer, which gives you access to the oscilloscope via USB. You can download "UltraScope for DS1000E" from the Rigol Software Applications page; although it has a PDF icon, it's actually a Zip file with the software.

Built-in help

If you hold down a button or knob for three seconds, the oscilloscope displays a help screen explaining its action. (I was surprised when I discovered this by accident.)

The built-in help feature of the Rigol DS1052E oscilloscope is triggered by holding down a button or knob.

Triggering

The three trigger sweep modes are Auto, Normal, and Single. Auto will keep displaying traces until you hit Run/Stop. Normal will display a trace every time the trigger condition is satisfied. Single will display a single trace when triggered and then stop. Auto is the way to see a waveform without worrying about triggering. But if you want a nice, stable waveform, set up the trigger and use Normal. Also make sure you're triggering from the right channel - the oscilloscope likes to default to using Channel 2 as the trigger.

Controlling the channels

If you've used an oscilloscope with separate controls for each channel, you may expect the knobs near CH1 to control channel 1, and the knobs near CH2 to control channel 2. Instead, if you hit CH1, the knobs control channel 1's scale and position, while if you hit CH2, Math, or Ref, the same knobs control that channel's scale and position. Make sure you're controlling the trace you think you're controlling.

Use the colored probe rings

Maybe this is too obvious to mention, but putting matching colored rings on both ends of the oscilloscope probes lets you easily tell which probe goes with which channel.

Label oscilloscope probes with colored rings that match the trace colors.

Cursors

The cursors are very handy to measure voltages, times between two points, frequency, etc. (The Measure mode provides lots of automated measurements, but often doesn't measure what you want.) Manual mode lets you position two cursors (either vertical or horizontal), and the positions and difference are displayed. Track lets you position a cursor along the waveform, and a voltage cursor automatically tracks the waveform. Both time and voltage values are displayed. Auto mode is the mode you should use with Measure, in order to see what the measurements mean.

A tracking cursor puts X-Y lines on the waveform and gives measurements.

Finding the manual

Search for DS1000E (not DS1052E) to find the user's guide and other documentation.

Conclusions

I'm glad I bought the Rigol DS1052E - it performs very well for a low-price ($329) oscilloscope. (If money is no object, there's Agilent's $439,000 Infiniium oscilloscope. :-) I hope you find these tips useful. If you have any additional oscilloscope tips, please leave a comment.

↧

Reverse-engineering the flag circuits in the 8085 processor

July 16, 2013, 10:13 pm

Processors all have status flags to keep track of conditions such as a zero value, a carry, or a negative value. Whenever you write a loop or conditional, these flags ultimately are in control. But how are these flags implemented in the chip's silicon? I've reverse-engineered the flag circuits in the 8085 microprocessor and explain what is really going on.

The photograph below is a highly magnified image of the 8085's silicon, showing the relevant parts of the chip. In the upper-left, the arithmetic logic unit (ALU) performs 8-bit arithmetic operations. The status flag circuitry is below the ALU and the flags are connected to the data bus (indicated in blue). To the right of the ALU, the control PLA decodes the instructions into control lines that control the operations of the ALU and flag circuits.

The 8085 has seven status flags.

Bit 7 is the sign flag, indicating a negative two's-complement value, which is simply a byte with the top bit set.
Bit 6 is the zero flag, indicating a value that is all zeros.
Bit 5 is the undocumented K (or X5) flag, indicating either a carry from the 16-bit incrementer/decrementer or the result of a signed comparison. See my article on the undocumented K and V flags.
Bit 4 is the auxiliary carry, indicating a carry out of the 4 low-order bits. This is typically used for BCD (binary-coded decimal) arithmetic.
Bit 3 is unused and set to 0. Interestingly, a fairly large transistor drives the data bus line to 0 when reading the flags, so this unused flag bit doesn't come for free.
Bit 2 is the parity flag, which is set if the result has an even number of 1 bits.
Bit 1 is the undocumented signed overflow flag V (details).
Bit 0 is the carry flag.
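
To make the bit layout concrete, here is a small Python sketch that unpacks a flag byte using the bit positions listed above, and computes the documented flags for an 8-bit addition. The function names are made up for this example, and the undocumented K and V flags are simply left at 0 rather than modeled.

```python
# Bit positions from the list above; bit 3 is unused.
FLAG_BITS = {"S": 7, "Z": 6, "K": 5, "AC": 4, "P": 2, "V": 1, "CY": 0}


def decode_flags(flag_byte: int) -> dict:
    """Unpack a flag byte into named flag bits."""
    return {name: (flag_byte >> bit) & 1 for name, bit in FLAG_BITS.items()}


def flags_after_add(a: int, b: int, carry_in: int = 0) -> int:
    """Flag byte after an 8-bit add (documented flags only; K and V left at 0)."""
    total = a + b + carry_in
    result = total & 0xFF
    sign = result >> 7                                        # top bit of the result
    zero = int(result == 0)                                   # all bits zero
    aux = int((a & 0x0F) + (b & 0x0F) + carry_in > 0x0F)      # carry out of bit 3
    parity = int(bin(result).count("1") % 2 == 0)             # even number of 1 bits
    carry = total >> 8                                        # carry out of bit 7
    return (sign << 7) | (zero << 6) | (aux << 4) | (parity << 2) | carry


flags = flags_after_add(0xFF, 0x01)   # result wraps around to 00
print(hex(flags))                     # 0x55
print(decode_flags(flags))
```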

The image below zooms in on the flag silicon, showing individual transistors. The large transistors labeled with the flag name drive the flag value onto the data bus. From the data bus, the flag values control the results of conditional jumps, calls, and returns. The complex circuits above these transistors compute and store the flag values.

The schematic below shows the flag circuit that is implemented in the silicon above.

Schematic of the flag storage in the 8085 microprocessor.

Each flag bit has a latch and control lines to write a value to the latch. Most flags are updated by the same arithmetic instructions and controlled by the arith_to_flags control line. The carry flag is affected by additional instructions and has its own control line. The undocumented K and V flags are updated in different circumstances and have their own control lines.

The bus_to_flags control loads the flags from the data bus for the POP PSW instruction, while the flags_to_bus control sends the flag values over the data bus for the PUSH PSW instruction or for conditional branches.

The circuitry to compute most flag values is straightforward. The sign flag is set based on bit 7 of the result. The auxiliary carry flag is set on the carry out of bit 3. The K and V flags are set based on the top two bits (details). The zero flag is normally set from the alu_zero signal that indicates all bits are zero.

The zero flag has support for multi-byte zero: at each step it can AND the existing zero flag with the current ALU zero value, so the zero flag will be set if both bytes are zero. This is only used for the (undocumented) DSUB 16-bit subtract instruction. Strangely, this circuit is also activated for the 16-bit DAD instructions, but the result is not stored in the flag.
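
The multi-byte zero behavior can be sketched in a few lines of Python. This is a behavioral model of a 16-bit subtract done one byte at a time, not a description of the actual gates, and the function name is made up: the first step loads the zero flag from the ALU's zero output, and the second step ANDs the existing flag with the new zero output.

```python
def zero_flag_after_sub16(x: int, y: int) -> int:
    """Zero flag after a 16-bit subtract performed as two 8-bit steps."""
    # Step 1: low bytes. The zero flag is loaded from the ALU zero signal.
    low = (x & 0xFF) - (y & 0xFF)
    borrow = int(low < 0)
    zero = int(low & 0xFF == 0)

    # Step 2: high bytes. The existing zero flag is ANDed with the new ALU zero.
    high = (x >> 8) - (y >> 8) - borrow
    zero &= int(high & 0xFF == 0)
    return zero


print(zero_flag_after_sub16(0x1234, 0x1234))  # 1: both bytes of the result are zero
print(zero_flag_after_sub16(0x1235, 0x1234))  # 0: high byte is zero, low byte is not
```

Without the AND step, the second example would end with the zero flag set, since the last byte processed is zero.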

If you look at the chip photograph at the top of the article, the flags are arranged in apparently-random order, not in their bit order as you might expect. Presumably the layout used is more efficient. Also notice that the carry flag C is off to the right of the ALU. Because of the complexity of the carry logic, which will be discussed next, the circuitry wouldn't fit under the ALU with the rest of the flag logic.

The carry logic

The schematic below shows the circuit for the carry flag. The logic for carry is more complex than for the other flags because carry is used in a variety of ways.

Schematic of the carry circuitry in the 8085 microprocessor.

The value stored in the carry flag

The top part of the circuit computes carry_result, the value stored in the carry flag. This value has several different meanings depending on the instruction:

For arithmetic operations, the carry flag is loaded with the value generated by the ALU. That is, alu_carry_7 (the high-order carry from bit 7 of the ALU) is used. (See Inside the ALU of the 8085 microprocessor for details on how this is computed.)
For DAA (decimal adjust accumulator), the carry flag is set if the high-order digit is >= 10. This value is alu_hi_ge_10, which is selected by the daa control line.
For CMC (complement carry), the carry flag value is complemented. To compute this, the previous carry flag value c_flag is selected by use_carry_flag and complemented by the xor_carry_result control line.
For ARHL/RAR/RRC (rotate right operations), bit 0 of the rotated value goes into the carry. In the circuit, reg_act_0 (the low-order bit in the undocumented ACT (accumulator temp) register) is selected by the alu_shift_right control line.

The xor_carry_result control inverts the carry value in a few cases. For subtraction and comparison, it flips the carry bit to be the borrow bit. For STC (set carry), the xor_carry_result control forces the carry to 1. For AND operations, it forces the carry to 0.
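As a rough behavioral sketch (not the actual gate layout), the selection and inversion described above can be modeled like this. The signal names come from the text; the `source` parameter, and the assumption that nothing selected yields 0 so that STC's inversion produces 1, are mine. The AND case is not modeled.

```python
def carry_result(source, xor_carry_result, alu_carry_7=0, alu_hi_ge_10=0,
                 c_flag=0, reg_act_0=0):
    """Pick one carry source, then optionally invert it."""
    sources = {
        "alu": alu_carry_7,            # arithmetic operations
        "daa": alu_hi_ge_10,           # decimal adjust
        "use_carry_flag": c_flag,      # CMC
        "alu_shift_right": reg_act_0,  # rotate right operations
        "none": 0,                     # ASSUMPTION: nothing selected reads as 0
    }
    return sources[source] ^ xor_carry_result


# CMC: the old carry flag is selected and complemented.
cmc = carry_result("use_carry_flag", 1, c_flag=1)
# STC: with no source selected (assumed 0), the inversion yields 1.
stc = carry_result("none", 1)
```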

Generating the carry input signal

The middle part of the circuit selects the appropriate carry_in value that is supplied to the ALU.

The first option is to set the carry in to either 0 or 1, by using carry_in_0 and optionally xor_carry_in. This is used for most instructions.

The next option is to use the current carry flag value as an input for additions or subtractions (allowing multi-byte arithmetic). For subtraction, this is inverted to convert borrow to carry; the xor_carry_in control does this.

The final option uses the carry latch to temporarily hold the carry for the undocumented LDHI and LDSI instructions. These instructions add a constant to a 16-bit register pair, so they need to add the carry from the low-order sum to the high-order byte. The carry latch temporarily holds the carry, and this value is selected by the use_latched_carry control line. You might wonder why not just use the normal carry flag; the LDHI and LDSI instructions are designed to leave the carry flag unchanged, so they need somewhere else to temporarily store the carry. The surprising conclusion is that Intel deliberately included circuitry in the 8085 specifically to support these undocumented instructions, and then decided not to support these instructions. (In contrast, the 6502's unsupported instructions are just random consequences of unsupported opcodes.)
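To illustrate the role of the carry latch, here is a small Python model of adding an 8-bit constant to a 16-bit register pair in two byte-wide steps. It is a sketch of the behavior described above, with made-up function and variable names, not a model of the real circuit.

```python
def add_const_16(pair, const8, c_flag):
    """Add an 8-bit constant to a 16-bit pair, one byte at a time.

    Returns (result, c_flag). The carry between the two steps lives in
    a separate latch, so c_flag comes back unchanged.
    """
    low = (pair & 0xFF) + const8               # low byte, carry in of 0
    carry_latch = low >> 8                     # held in the latch, not the flag
    high = ((pair >> 8) + carry_latch) & 0xFF  # use_latched_carry selects it
    return (high << 8) | (low & 0xFF), c_flag
```

For example, add_const_16(0x12F0, 0x20, 1) gives 0x1310 with the carry flag still 1.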

Generating the shift_right input signal

Each bit of the ALU has a shift right input. For most of the bits, the input comes from the bit to the left, but the high-order bit uses different inputs depending on the instruction. The bottom circuit in the schematic above generates the shift right input for the ALU. This circuit has two simple options.

Normally the carry flag is fed into shift_right_in. For the ARHL and RAR instructions, this causes the carry flag to go into the high-order bit.
For the RRC and RLC instructions (rotate A right/left), the rotate_carry control selects bit 0 as the shift right input.
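The two options can be seen in a short Python model of the 8-bit right rotates. This covers only RAR and RRC and is a behavioral sketch; the function name is made up, while rotate_carry and shift_right_in are the signals named above.

```python
def rotate_right(a, c_flag, rotate_carry):
    """Rotate an 8-bit value right by one bit.

    rotate_carry=False models RAR: the old carry enters bit 7.
    rotate_carry=True  models RRC: bit 0 wraps around into bit 7.
    Returns (result, new_carry); bit 0 goes into the carry either way.
    """
    bit0 = a & 1
    shift_right_in = bit0 if rotate_carry else c_flag
    result = (shift_right_in << 7) | (a >> 1)
    return result, bit0
```

With A = 0x01 and carry clear, RAR gives 0x00 and RRC gives 0x80; both set the carry.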

Conclusions

By reverse-engineering the 8085, we can see how the flag circuits in the 8085 actually work at the gate and silicon level. One interesting feature is the circuitry to implement undocumented instructions and flags. Another interesting feature is the complexity of the carry flag compared to the other flags.

Footnotes on rotate

I recommend you skip this section, but there are a few confusing things about the rotate logic that I wanted to write down.

For some reason the rotate operations are named very strangely in the 8080 and 8085. RRC is the "rotate accumulator right" instruction and RAR is the "rotate accumulator right through carry" instruction. Based on the abbreviations, the names seem reversed. The left rotates RLC and RAL are similar. The Z-80 processor has a similar RRC instruction, but calls it "rotate right circular", making the abbreviation slightly less nonsensical.

Bit 0 of ACT is fed into shift_right_in for both RRC and RLC. However, this input is just ignored for RLC since the rotation is the other direction, so I assume this is just a result of the control logic treating RRC and RLC the same.

To reduce the control circuitry, the rotate_carry and use_latched_carry control lines are actually the same control line since the instructions that use them don't conflict. In other words, there is just one control line, but it has two distinct functions.

↧

Four Rigol oscilloscope hacks with Python

July 19, 2013, 12:31 am

A Rigol oscilloscope has a USB output, allowing you to control it with a computer and perform additional processing externally. I was inspired by Cibo Mahto's article Controlling a Rigol oscilloscope using Linux and Python, and came up with some new Python oscilloscope hacks: super-zoomable graphs, generating a spectrogram, analyzing an IR signal, and dumping an oscilloscope trace as a WAV file. The key techniques I illustrate are connecting to the oscilloscope with Windows, accessing a megabyte of data with Long Memory, and performing analysis on the data.

Analyzing the IR signal from a TV remote using an IR sensor and a Rigol DS1052E oscilloscope.

Super-zoomable graphs

One of the nice features of the Rigol is "Long Memory" - instead of downloading the 600-point trace that appears on the screen, you can record and access a high-resolution trace of 1 million points. In this hack, I show how you can display this data with Python, giving you a picture that you can easily zoom into with the mouse.

The following screenshot shows the data collected by hooking the oscilloscope up to an IR sensor. In the above picture, the sensor is the three-pin device below the screen. Since I've developed an IR library for Arduino, my examples focus on IR, but any sort of signal could be used. By enabling Long Memory, we can download not just the data on the screen, but 1 million data points, allowing us to zoom way, way in. The graph below shows what is sent when you press a button on the TV remote - the selected button transmits a code, followed by a periodic repeat signal as long as the button is held down.

The IR signal from a TV remote. The first block is the code, followed by periodic repeat signals while the button is held down.

But with Long Memory, we can interactively zoom way in on the waveform and see the actual structure of the code - long header pulses followed by a sequence of wide and narrow pulses that indicate the particular button. That's not the end of the zooming - we can zoom way in on an edge of a pulse and see the actual rise time of the signal over a few microseconds. You can do some pretty nice zooming when you have a million datapoints to plot.

To use this script, first enable Long Memory by going to Acquire: MemDepth. Next, set the trigger sweep to Single. Capture the desired waveform on the oscilloscope. Then run the Python script to download the data to your computer, which will display a plot using matplotlib. To zoom, click the "+" icon at the bottom of the screen. This lets you pan back and forth through the data by holding down the left mouse button. You can zoom in and out by holding the right mouse button down and moving the mouse right or left. The magnifying glass icon lets you select a zoom rectangle with the mouse. You can zoom on your oscilloscope too, of course, but using a mouse and having labeled axes can be much more convenient.

A few things to notice about the code. The first few lines get the list of instruments connected to VISA and open the USB instrument (i.e. your oscilloscope). The timeout and chunk size need to be increased from the defaults to download the large amount of data without timeouts.

Next, ask_for_values gets various scale values from the oscilloscope so the axes can be labeled properly. By setting the mode to RAW we download the full dataset, not just what is visible on the screen. We get the raw data from channel 1 with :WAV:DATA? CHAN1. The first 10 bytes are a header and should be discarded. Next, the raw bytes are converted to numeric values with Mahto's formulas. Finally, matplotlib plots the data.
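The conversion step can be sketched without any hardware attached. The 10-byte header comes from the description above; the constants 255, 130 and 25 are my recollection of Mahto's formulas and should be checked against his article before relying on them. The function name and parameters are made up for illustration.

```python
HEADER_BYTES = 10  # from the text: the first 10 bytes are a header


def raw_to_volts(raw, volt_scale, volt_offset):
    """Convert a raw :WAV:DATA? reply into a list of voltages.

    ASSUMPTION: the constants 255, 130 and 25 are recalled from
    Mahto's article and are not verified here.
    """
    samples = bytearray(raw)[HEADER_BYTES:]  # drop the header
    volts = []
    for b in samples:
        inverted = 255 - b                   # raw bytes are inverted
        volts.append((inverted - 130.0 - volt_offset / volt_scale * 25)
                     / 25 * volt_scale)
    return volts
```

In a real script, volt_scale and volt_offset would be the values read back from the oscilloscope with ask_for_values, and the resulting list would be handed to matplotlib for plotting.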

There are a couple "gotchas" with Long Memory. First, it only works reliably if you capture a single trace by setting the trigger sweep to "single". Second, downloading all this data over USB takes 10 seconds or so, which can be inconveniently slow.

Analyze an IR signal

Once we can download a signal from the oscilloscope, we can do more than just plot it - we can process and analyze it. In this hack, I decode the IR signal and print the corresponding hex value. Since it takes 10 seconds to download the signal, this isn't a practical way of using an IR remote for control. The point is to illustrate how you can perform logic analysis on the oscilloscope trace by using Python.
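As a sketch of this kind of analysis, the code below turns a list of samples into runs of high and low levels and then reads the bits from the gap widths. The encoding is an assumption for illustration (a fixed number of header runs, then a pulse followed by a wide gap for 1 or a narrow gap for 0); real remotes differ, and all the names here are hypothetical.

```python
def run_lengths(samples, threshold):
    """Collapse samples into (level, count) runs, level = sample > threshold."""
    runs = []
    for s in samples:
        level = s > threshold
        if runs and runs[-1][0] == level:
            runs[-1][1] += 1
        else:
            runs.append([level, 1])
    return [(level, count) for level, count in runs]


def decode(samples, threshold, header_runs=2):
    """Decode the bits that follow the header into an integer.

    ASSUMPTION: after header_runs runs, the signal alternates pulse, gap,
    pulse, gap, ...; a wide gap is a 1 and a narrow gap is a 0. If the
    trace starts with an idle period, header_runs must include it.
    """
    body = run_lengths(samples, threshold)[header_runs:]
    gaps = [count for _, count in body[1::2]]
    cutoff = (min(gaps) + max(gaps)) / 2.0  # midway between narrow and wide
    value = 0
    for count in gaps:
        value = (value << 1) | (1 if count > cutoff else 0)
    return value


# Synthetic trace: header (8 high, 4 low), then bits 1, 0, 1, 1.
trace = ([1] * 8 + [0] * 4 + [1] + [0] * 3 + [1] + [0] * 1 +
         [1] + [0] * 3 + [1] + [0] * 3 + [1])
print(hex(decode(trace, 0.5)))
```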

This code shows how the Python script can wait for the oscilloscope to be triggered and enter the STOP state. It also shows how you can use Python to initialize the oscilloscope to a desired configuration. The oscilloscope gets confused if you send too many commands at once, so I put a short delay between the commands.
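The waiting and pacing logic might look roughly like the following. The functions take the `ask` and `write` callables as parameters so the sketch runs without an oscilloscope; with PyVISA you would pass scope.ask and scope.write. The `:TRIG:STAT?` query is what I believe the DS1000D/E Programming Guide uses for trigger status, so check the guide; the function names, delays, and timeout are assumptions.

```python
import time


def wait_for_stop(ask, poll_interval=0.1, timeout=30.0):
    """Poll the trigger status until the oscilloscope reports STOP.

    Returns True once STOP is seen, False if the timeout expires first.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        if ask(":TRIG:STAT?").strip() == "STOP":
            return True
        time.sleep(poll_interval)
    return False


def send_commands(write, commands, delay=0.1):
    """Send configuration commands one at a time with a pause after each."""
    for command in commands:
        write(command)
        time.sleep(delay)
```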

Generate a spectrogram

Another experiment I did was using Python libraries to generate a spectrogram of a signal recorded by the oscilloscope. I simply hooked a microphone to the oscilloscope, spoke a few words, and used the script below to analyze the signal. The spectrogram shows low frequencies at the bottom, high frequencies at the top, and time progresses left to right. This is basically an FFT swept through time.

A spectrogram generated by matplotlib using data from a Rigol DS1052E oscilloscope.

To use this script, set up the oscilloscope for Long Memory as before, record the signal, and then run the script.
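To show what a transform swept through time means, here is a library-free sketch that slides a window along the samples and computes the magnitude spectrum of each window with a plain (slow) DFT. It is only meant to illustrate the idea; in practice matplotlib and Numpy do this far faster, and the function names and window parameters are made up.

```python
import cmath
import math


def dft_magnitudes(window):
    """Magnitude of each frequency bin (0 .. n/2 - 1) of one window."""
    n = len(window)
    return [abs(sum(window[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2)]


def spectrogram(samples, window_size, step):
    """One spectrum per window position: a list of columns, left to right."""
    return [dft_magnitudes(samples[i:i + window_size])
            for i in range(0, len(samples) - window_size + 1, step)]


# A tone with exactly 4 cycles per 32-sample window peaks in bin 4.
tone = [math.sin(2 * math.pi * 4 * t / 32) for t in range(128)]
columns = spectrogram(tone, window_size=32, step=32)
```

Each column is one vertical slice of the spectrogram; plotting the columns side by side gives the picture with time on the horizontal axis.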

Dump data to a .wav file

You might want to analyze the oscilloscope trace with other tools, such as Audacity. By dumping the oscilloscope data into a WAV file, it can easily be read into other software. Or you can play the data and hear how it sounds.

To use this script, enable Long Memory as described above, capture the signal, and run the script. A file channel1.wav will be created.
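Writing the file itself needs only Python's standard wave module. The sketch below assumes the samples are already unsigned 8-bit values (0 to 255), which is how 8-bit WAV data is stored; the function name and the choice of sample rate are up to the caller and are not taken from the original script.

```python
import wave


def write_wav(path, samples, sample_rate):
    """Write unsigned 8-bit mono samples (0-255) to a WAV file."""
    w = wave.open(path, "wb")
    try:
        w.setnchannels(1)            # mono
        w.setsampwidth(1)            # 1 byte per sample; 8-bit WAV is unsigned
        w.setframerate(sample_rate)  # integer samples per second
        w.writeframes(bytes(bytearray(samples)))
    finally:
        w.close()
```

The sample rate determines how fast the trace plays back, so a trace captured at a high oscilloscope rate can be written with an audio-range rate to make it audible.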

How to install the necessary libraries

Before connecting your oscilloscope to your Windows computer, there are several software packages you'll need.

I assume you have Python already installed - I'm using 2.7.3.
Install NI-VISA Run-Time Engine 5.2. This is National Instruments Virtual Instrument Software Architecture, providing an interface to hardware test equipment.
Install PyVISA, the Python interface to VISA.
If you want to run the graphical programs, install Numpy and matplotlib.

You can also use Rigol's UltraScope for DS1000E software, but the included NI-VISA 4.3 software doesn't work with pyVisa - I ended up with VI_WARN_CONFIG_NLOADED errors. If you've already installed UltraScope, you'll probably need to uninstall and reinstall NI-VISA.

If you're using Linux instead of Windows, see Mahto's article.

How to control and program the oscilloscope

Once the software is installed (above), connect the oscilloscope to the computer's USB port. Use the USB port on the back of the oscilloscope, not the flash drive port on the front panel.

Hopefully the code examples above are clear. First, the Python program must get the list of connected instruments from pyVisa and open the USB instrument, which will have a name like USB0::0x1AB1::0x0588::DS1ED141904883. Once the oscilloscope connection is open, you can use scope.write() to send a command to the oscilloscope, scope.ask() to send a command and read a result string, and scope.ask_for_values() to send a command and read numeric values back from the oscilloscope as a list of floats.
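Picking the oscilloscope out of the instrument list is plain string handling, so it can be sketched without VISA installed. The helper names are hypothetical; the field layout (interface, vendor ID, product ID, serial number) follows the standard VISA USB resource name format.

```python
def find_usb_instrument(names):
    """Return the first instrument name that starts with 'USB', or None."""
    for name in names:
        if name.startswith("USB"):
            return name
    return None


def parse_usb_name(name):
    """Split a VISA USB resource name into its fields."""
    parts = name.split("::")
    return {
        "interface": parts[0],
        "vendor_id": int(parts[1], 16),   # e.g. 0x1AB1
        "product_id": int(parts[2], 16),  # e.g. 0x0588
        "serial": parts[3],
    }
```

The name returned by find_usb_instrument is what you would then pass to pyVisa to open the connection.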

When the oscilloscope is under computer control, the screen shows Rmt and the front panel is non-responsive. The "Force" button will restore local control. Software can release the oscilloscope by sending the corresponding ":KEY:FORCE" command.

Error handling in pyVisa is minimal. If you send a bad command, it will hang and eventually time out with VisaIOError: VI_ERROR_TMO: Timeout expired before operation completed.

The API to the oscilloscope is specified in the DS1000D/E Programming Guide. If you do any Rigol hacking, you'll definitely want to read this. Make sure you use the right programming guide for your oscilloscope model - other models have slightly different commands that seem plausible, but they will time out if you try them.

Conclusions

Connecting an oscilloscope to a computer opens up many opportunities for processing the measurement data, and Python is a convenient language to do this. The Long Memory mode is especially useful, since it provides extremely detailed data samples.

↧

Reverse-engineering the 8085's ALU and its hidden registers

July 19, 2013, 9:07 am

This article describes how the ALU of the 8085 microprocessor works and how it interacts with the rest of the chip, based on reverse-engineering of the silicon. (This is part 2 of my ALU reverse-engineering; part 1 described the circuit for a single ALU bit.) Along with the accumulator, the ALU uses two undocumented registers - ACT and TMP - and this article describes how they work in detail, as well as how the ALU is controlled.

The arithmetic-logic unit is a key part of the microprocessor, performing operations and comparisons on data. In the 8085, the ALU is also a key part of the data path for moving data. The ALU and associated registers take up a fairly large part of the chip, the upper left of the photomicrograph image below. The control circuitry for the ALU is in the top center of the image. The data bus (dbus) is indicated in blue.

Photograph of the 8085 chip showing the location of the ALU, flags, and registers.

The real architecture of the 8085 ALU

The following architecture diagram shows how the ALU interacts with the rest of the 8085 at the block-diagram level. The data bus (dbus) connects the ALU and associated registers with the rest of the 8085 microprocessor. There are also numerous control lines, which are not shown.

The ALU uses two temporary registers that are not directly visible to the programmer. The Accumulator Temporary register (ACT) holds the accumulator value while an ALU operation is performed. This allows the accumulator to be updated with the new value without causing a race condition. The second temporary register (TMP) holds the other argument for the ALU operation. The TMP register typically holds a value from memory or another register.

Architecture of the 8085 ALU as determined from reverse-engineering.

The 8085 datasheet has an architecture diagram that is simplified and not quite correct. In particular, the ACT register is omitted and a data path from the data bus to the accumulator is shown, even though that path doesn't exist.

The accumulator and ACT registers

To the programmer, the accumulator is the key register for arithmetic operations. Reverse-engineering, however, shows the accumulator is not connected directly to the ALU, but works closely with the ACT (accumulator temporary) register.

The ACT register has several important functions. First, it holds the input to the ALU. This allows the results from the ALU to be written back to the accumulator without disturbing the input, which would cause instability. Second, the ACT can hold constant values (e.g. for incrementing or decrementing, or decimal adjustment) without affecting the accumulator. Finally, the ACT allows ALU operations that don't use the accumulator.

The accumulator and ACT (Accumulator Temporary) registers and their control lines in the 8085 microprocessor.

The diagram above shows how the accumulator and ACT registers are connected, and the control lines that affect them. One surprise is that the only way to put a value into the accumulator is through the ALU. This is controlled by the alu_to_a control line. You might expect that if you load a value into the accumulator, it would go directly from the data bus to the accumulator. Instead, the value is OR'd with 0 in the ALU and the result is stored in the accumulator.

The accumulator has two status outputs: a_hi_ge_10, if the four high-order bits are ≥ 10, and a_lo_ge_10, if the four low-order bits are ≥ 10. These outputs are used for decimal arithmetic, and will be explained in another article.

The accumulator value or the ALU result can be written to the databus through the sel_alu_a control (which selects between the ALU result and the accumulator), and the alu/a_to_dbus control line, which enables the superbuffer to write the value to the data bus. (Because the data bus is large and connects many parts of the chip, it requires high-current signals to overcome its capacitance. A "superbuffer" provides this high-current output.)

The ACT register can hold a variety of different values. In a typical arithmetic operation, the accumulator value is loaded into the ACT via the a_to_act control. The ACT can also load a value from the data bus via dbus_to_act. This is used for the ARHL/DAD/DSUB/LDHI/LDSI/RDEL instructions (all of which are undocumented except DAD). These instructions perform arithmetic operations without involving the accumulator, so they require a path into the ALU that bypasses the accumulator.

The control lines allow the ACT register to be loaded with a variety of constants. The 0/fe_to_act control line loads either 0 or 0xfe into the ACT; the value is selected by the sel_0_fe control line. The value 0 has a variety of uses. ORing a value with 0 allows the value to pass through the ALU unchanged. If the carry is set, ADDing to 0 performs an increment. The value 0xfe (signed -2) is used only for the DCR (decrement by 1) instruction. You might think the value 0xff (signed -1) would be more appropriate, but if the carry is set, ADDing 0xfe decrements by 1. I think the motivation is so both increments and decrements have the carry set, and thus can use the same logic to control the carry.
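The arithmetic behind these constants is easy to check in Python. This is just a numeric illustration of the paragraph above with made-up function names, not a model of the circuit.

```python
def alu_add(act, tmp, carry_in):
    """8-bit add with carry in; returns (result, carry_out)."""
    total = act + tmp + carry_in
    return total & 0xFF, total >> 8


def inr(x):
    """Increment: ACT = 0, carry in set."""
    return alu_add(0x00, x, 1)[0]


def dcr(x):
    """Decrement: ACT = 0xfe, carry in set, so the net effect is adding 0xff."""
    return alu_add(0xFE, x, 1)[0]


def passthrough(x):
    """ORing with 0 leaves the value unchanged."""
    return 0x00 | x
```

Since 0xfe + 1 = 0xff, which is -1 modulo 256, adding 0xfe with the carry set is the same as subtracting 1.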

Since the 8085 has a 16-bit increment/decrement circuit, you might wonder why the ALU is also used for increment/decrement. The main reason is that using the ALU allows the condition flags to be set by INR and DCR. In contrast, the 16-bit increment and decrement instructions (INX and DCX) use the incrementer/decrementer, and as a consequence the flags are not updated.

To support BCD, the ACT can be loaded with decimal adjustment values 0x00, 0x06, 0x60, or 0x66. The top and bottom four bits of ACT are loaded with the value 6 with the 6x_to_act and x6_to_act control lines respectively.
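A small sketch shows how the two control lines produce the four adjustment values and how adding them corrects a BCD digit. The function name and boolean parameters are hypothetical stand-ins for the 6x_to_act and x6_to_act lines, and the logic that decides when each line is activated is left out.

```python
def act_adjust(six_hi, six_lo):
    """ACT value for decimal adjust: 0x00, 0x06, 0x60 or 0x66."""
    return (0x60 if six_hi else 0x00) | (0x06 if six_lo else 0x00)


# BCD 25 + 15: the binary sum is 0x3a, whose low digit is >= 10.
# Adding 0x06 carries into the high digit and gives the BCD result 0x40.
low_fixed = (0x3A + act_adjust(False, True)) & 0xFF

# BCD 50 + 50: the binary sum is 0xa0, whose high digit is >= 10.
# Adding 0x60 gives 0x00 with a carry out, i.e. BCD 100.
high_total = 0xA0 + act_adjust(True, False)
high_fixed, high_carry = high_total & 0xFF, high_total >> 8
```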

It turns out that the decimal adjustment values are easily visible in the silicon. The following image shows the silicon that implements the ACT register. Each of the large pink structures is one bit. The eight bits are arranged with bit 7 on the left and bit 0 on the right. Note that half of the bits have pink loops at the top, in the pattern 0110 0110. These loops pull the associated bit high, and are used to set the high and/or low four bits to 6 (binary 0110).

The ACT register in the 8085. This image shows the silicon that implements the 8-bit register.

Building the 8-bit ALU from single-bit slices

In my previous article on the 8085 ALU I described how each bit of the ALU is implemented. Each bit slice of the ALU takes two inputs and performs a simple operation: or, add, xor, and, shift right, complement, or subtract. The ALU has a shift right input and a carry input, and generates a carry output. In addition, each slice of the ALU contributes to the parity and zero calculations. The ALU has five control lines to select the operation.

One bit of the ALU in the 8085 microprocessor

The ALU has seven basic operations: or, add, xor, and, shift right, complement, and subtract. The following table shows the five control lines that select the operation, and the meaning of the carry line for the operation. Note that the meaning of carry in and carry out is different for each operation. For bit operations, the implementation of the ALU circuitry depends on a particular carry in value, even though carry is meaningless for these operations.

Operation	select_neg_in2	select_op1	select_op2	select_shift_right	select_ncarry_1	Carry in/out
or	0	0	0	0	1	1
add	0	1	0	0	0	/carry
xor	0	1	0	0	1	1
and	0	1	1	0	1	0
shift right	0	0	1	1	1	0
complement	1	0	0	0	1	1
subtract	1	1	0	0	0	borrow

The eight-bit ALU is formed by linking eight single-bit ALUs as shown below. The high-order bit is on the left, and the low-order bit on the right, matching the layout in silicon. The carry, parity, and zero values propagate through each ALU to form the final values on the left. The right shift input is simply the bit from the right, with the exception of the topmost bit which uses a special shift right input. The auxiliary carry is simply the carry out of bit three. The control lines to select the operation are fed into all eight ALU slices. By combining eight of these ALU slices, the whole 8-bit ALU is created. The values from the top bit are used to control the parity, zero, carry, and sign flags (as well as the undocumented K and V flags). Bit 3 generates the half carry flag.

The 8-bit ALU in the 8085 is formed by combining eight 1-bit slices.
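The way eight slices chain together can be sketched in Python for the add operation. The real slice circuit is built differently and handles all seven operations; this simplified model (with made-up names) only shows how carry, parity, and zero propagate from bit 0 up to bit 7, and where the auxiliary carry is tapped.

```python
def add_slice(a, b, carry_in):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def add8(x, y, carry_in=0):
    """Chain eight slices, low-order bit first, collecting the flag values."""
    result, carry, aux = 0, carry_in, 0
    parity, zero = 1, True            # even parity and all-zero so far
    for bit in range(8):
        s, carry = add_slice((x >> bit) & 1, (y >> bit) & 1, carry)
        result |= s << bit
        parity ^= s                   # each 1 bit flips the parity
        zero = zero and s == 0
        if bit == 3:
            aux = carry               # auxiliary carry: carry out of bit 3
    return {"result": result, "carry": carry, "aux": aux,
            "parity": parity, "zero": zero, "sign": result >> 7}
```

For example, add8(0x0F, 0x01) gives 0x10 with the auxiliary carry set, and add8(0xFF, 0x01) gives 0x00 with carry, auxiliary carry, zero, and (even) parity all set.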

The control lines

The ALU uses 29 control lines that are generated by a PLA that activates the right control lines based on the opcode and the position in the instruction cycle. For reference, the following table lists the 29 ALU control lines and the instructions that affect them.

Control line	Relevant instructions
`ad_latch_dbus, write_dbus_to_alu_tmp, /ad_dbus`	`IN/LDA/LHLD`
`/ad_dbus`	`ARHL/DAD/DSUB/LDHI/LDSI/RDEL`
`/alu/a_to_dbus`	all
`/dbus_to_act`	`ARHL/DAD/DSUB/LDHI/LDSI/RDEL`
`a_to_act`	`ACI/ADC/ADD/ADI/ANA/ANI/CMP/CPI/ORA/ORI/RAL/RAR/RLC/RRC/SBB/SBI/SUB/SUI/XRA/XRI`
`0/fe_to_act`	all
`sel_alu_a`	all
`alu_to_a`	`ACI/ADC/ADD/ADI/ANA/ANI/CMA/CMC/DAA/DCR/IN/INR/LDA/LDAX/MOV/MVI/ORA/ORI/POP/RAL/RAR/RIM/RLC/RRC/SBB/SBI/SIM/STC/SUB/SUI/XRA/XRI`
`/daa`	`DAA`
`sel_0_fe`	`DCR`
`store_v_flag`	`ACI/ADC/ADD/ADI/ANA/ANI/ARHL/CMP/CPI/DAA/DCR/INR/ORA/ORI/RAL/RAR/RLC/RRC/SBB/SBI/SUB/SUI/XRA/XRI`
`select_shift_right`	`ARHL/RAR/RRC`
`arith_to_flags`	`ACI/ADC/ADD/ADI/ANA/ANI/CMP/CPI/DAA/DCR/DSUB/INR/ORA/ORI/SBB/SBI/SUB/SUI/XRA/XRI`
`bus_to_flags`	`POP PSW`
`/zero_flag_combine`	`DAD/DSUB`
`/flags_to_bus`	`ACI/ADC/ADD/ADI/ANA/ANI/ARHL/CALL/CC/CM/CMA/CMC/CMP/CNC/CNZ/CP/CPE/CPI/CPO/CZ/DAA/DAD/DCR/DCX/DI/DSUB/EI/HLT/IN/INR/INX/JC/JK/JM/JMP/JNC/JNK/JNZ/JP/JPE/JPO/JZ/LDA/LDAX/LDHI/LDSI/LHLD/LHLX/LXI/MOV/MVI/NOP/ORA/ORI/OUT/PCHL/POP/PUSH/RAL/RAR/RC/RDEL/RET/RIM/RLC/RM/RNC/RNZ/RP/RPE/RPO/RRC/RST/RSTV/RZ/SBB/SBI/SHLD/SHLX/SIM/SPHL/STA/STAX/STC/SUB/SUI/XCHG/XRA/XRI/XTHL`
`shift_right_in_select`	`ARHL`
`xor_carry_in`	`ANA/ANI/ARHL/CMP/CPI/DCR/DSUB/INR/RAR/RRC/SBB/SBI/SUB/SUI`
`select_op2`	`ANA/ANI/ARHL/RAR/RRC`
`/use_latched_carry /rotate_carry`	`LDHI/LDSI/RLC/RRC`
`/carry_in_0`	0 except for `ACI/ADC/DAD/DSUB/LDHI/LDSI/RAL/RDEL/RLC/SBB/SBI`
`select_op1`	`ACI/ADC/ADD/ADI/ANA/ANI/CMP/CPI/DAA/DAD/DCR/DSUB/INR/LDHI/LDSI/RAL/RDEL/RLC/SBB/SBI/SUB/SUI/XRA/XRI`
`select_ncarry_1`	`ACI/ADC/ADD/ADI/CMP/CPI/DAA/DAD/DCR/DSUB/INR/LDHI/LDSI/RAL/RDEL/RLC/SBB/SBI/SUB/SUI`
In combination with first control line, `write_dbus_to_alu_tmp`	`ADC/ADD/ANA/CMA/CMC/CMP/DAA/DCR/INR/MOV/ORA/RAL/RAR/RIM/RLC/RRC/SBB/SIM/STC/SUB/XRA`
`select_neg_in2`	`CMA/CMP/CPI/DSUB/SBB/SBI/SUB/SUI`
`carry_to_k_flag`	`DCX/INX`
`store_carry_flag`	`ACI/ADC/ADD/ADI/ANA/ANI/ARHL/CMC/CMP/CPI/DAA/DAD/DSUB/ORA/ORI/RAL/RAR/RDEL/RLC/RRC/SBB/SBI/STC/SUB/SUI/XRA/XRI`
`xor_carry_result`	xor for `ANA/ANI/CMC/CMP/CPI/DSUB/SBB/SBI/STC/SUB/SUI`
`/latch_carry use_carry_flag`	`CMC/LDHI/LDSI`

Conclusions

By reverse-engineering the 8085, we can see how the ALU actually works at the gate and silicon level. The ALU uses many standard techniques, but there are also some surprises and tricks. There are two registers (ACT and TMP) that are invisible to the programmer. You'd expect a direct path from the data bus to the accumulator, but instead the data passes through the ALU. The increment/decrement logic uses the unexpected constant 0xfe, and there are two totally different ways of performing increment/decrement. Several undocumented instructions perform ALU operations without involving the accumulator at all.

This information builds on the 8085 reverse-engineering done by the visual 6502 team. This team dissolves chips in acid to remove the packaging and then takes many close-up photographs of the die inside. Pavel Zima converted these photographs into mask layer images, generated a transistor net from the layers, and wrote a transistor-level 8085 simulator.

↧

Simulating a TI calculator with crazy 11-bit opcodes

August 11, 2013, 10:50 am

I've built a register-level simulator of a 1974 TI calculator chip that shows what actually happens inside a calculator when you perform operations and shows the calculator source code as it executes. The architecture of the calculator chip is pretty interesting, with 11-bit opcodes, a 9-bit address bus, and 44-bit BCD registers. The chip doesn't support multiplication or division, so these are performed with repeated addition or subtraction.

The simulator is at righto.com/ti.

↧