I think I have solved the problem and found the root cause of many people's problems. There were two root causes:
a. First was the use of the open-collector gates in U18 and U19 and the associated low-resistance pull-ups. The CF spec was clear that the signals driven by those gates did not need high current and in fact did not need high voltage for the CF to recognize the high logic state. The solution I came up with for this issue has been validated on my system, but with an open issue we will discuss in a minute. SOLUTION: Remove RR1 and replace U18 and U19 with 74LS00's or 74ALS00's. I am now using 74ALS00's but think I can now go back to 74LS00's.
b. The second root cause was more difficult to identify, but there were hints. Many, including myself, were saying the board seemed to work fine when using MYIDE but would not then work in a system environment. OK, then ask yourself what is different about those two modes of operation? The most basic difference is that under MYIDE you were basically doing operations on one sector (one LBA sector) at a time, with a long idle period between operations while you decided what to do next and then entered data from the keyboard. There are multi-sector operations in MYIDE, and sometimes they did not work right. I returned to the IDE and CF specs and there was one requirement that MYIDE did not follow. The specs state that before writing to a register (i.e., the "task file" for a read or write) you have to make sure the device is not busy, and then before actually issuing a command (for us normally either the read or write sector command) you need to ensure the device is ready. These are two different tasks that need to be performed at two different places in the code. MYIDE did not check for a device BUSY (bit 7 in the STATUS byte) before writing to the registers in WRLBA. MYIDE did check for READY (bit 6 in the STATUS byte) before executing the read or write command. SOLUTION: Split IDEWAITNOTBUSY into two subroutines. The first just checks the BUSY bit; add a call to it in WRLBA just before the calculated LBA is written to the device registers. The second, IDEWAITREADY, just checks the READY bit in the status byte; change the existing calls in the read and write sector subroutines to point to IDEWAITREADY.
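To make the ordering concrete, here is a minimal sketch of the split, written in Python rather than the monitor's own assembly. It is not the attached MONITOR code. FakeDrive and its busy_polls parameter are hypothetical stand-ins for the CF card, and the function names only mirror the subroutine names used above. The 0x20 value is the standard ATA READ SECTOR(S) command code.

```python
BUSY = 0x80             # STATUS bit 7
READY = 0x40            # STATUS bit 6
CMD_READ_SECTOR = 0x20  # ATA READ SECTOR(S) command code


class FakeDrive:
    """Hypothetical stand-in for the CF card (an assumption for illustration).

    It reports BUSY for a few status polls after every command and counts
    any LBA register write that arrives while it is still BUSY.
    """

    def __init__(self, busy_polls=3):
        self.busy_polls = busy_polls
        self.countdown = 0
        self.lba = None
        self.commands = []          # (command, LBA in effect) pairs
        self.writes_while_busy = 0

    def read_status(self):
        if self.countdown > 0:
            self.countdown -= 1
            return BUSY
        return READY

    def write_lba_registers(self, lba):
        if self.countdown > 0:
            self.writes_while_busy += 1
        self.lba = lba

    def write_command(self, cmd):
        self.commands.append((cmd, self.lba))
        self.countdown = self.busy_polls


def ide_wait_not_busy(drive, max_polls=1000):
    """Poll until BUSY (bit 7) is clear. Returns False on timeout."""
    for _ in range(max_polls):
        if not (drive.read_status() & BUSY):
            return True
    return False


def ide_wait_ready(drive, max_polls=1000):
    """Poll until READY (bit 6) is set. Returns False on timeout."""
    for _ in range(max_polls):
        if drive.read_status() & READY:
            return True
    return False


def wr_lba(drive, lba):
    # New step: make sure the device is not busy BEFORE touching the task file.
    if not ide_wait_not_busy(drive):
        raise TimeoutError("drive stayed BUSY")
    drive.write_lba_registers(lba)


def read_sector(drive, lba):
    wr_lba(drive, lba)
    # Existing step: make sure the device is ready before issuing the command.
    if not ide_wait_ready(drive):
        raise TimeoutError("drive never became READY")
    drive.write_command(CMD_READ_SECTOR)
```

Calling read_sector twice in a row on a FakeDrive leaves writes_while_busy at zero, because wr_lba now waits out the BUSY period left over from the previous command.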
Why were some seeing this issue and some not? If your system were a little slower it might not be evident. Even an MS-DOS or other nominally higher-speed system might not see the issue, depending on the overhead between sector operations. But when it did happen, this is why data would be written to the wrong sector. There was no error message from the CF card because it had no way of knowing it was using the wrong address.
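The timing dependence can be pictured with a toy timeline. This is only a sketch under a stated assumption: that a task-file write arriving while the card is still BUSY is simply dropped, so the previously latched LBA stays in effect. The function name, the tick counts and the parameters are all made up for illustration and were not measured on real hardware.

```python
def run_without_busy_check(lbas, busy_ticks, idle_ticks):
    """Return the LBA the card actually uses for each command.

    busy_ticks: how long the card stays BUSY after a command (assumed).
    idle_ticks: host overhead between one command and the next LBA write.
    """
    now = 0
    busy_until = 0
    latched_lba = None   # what the card's address registers currently hold
    used = []
    for lba in lbas:
        now += idle_ticks
        if now >= busy_until:
            latched_lba = lba   # card is idle, so the write lands
        # Otherwise the write is dropped (assumption) and the stale LBA stays.
        # The host does wait for READY before the command, as MYIDE did.
        now = max(now, busy_until)
        used.append(latched_lba)
        busy_until = now + busy_ticks
    return used


# Slow host: the gap between operations outlasts the BUSY period.
print(run_without_busy_check([10, 11, 12], busy_ticks=5, idle_ticks=8))
# Fast host: the LBA writes arrive while the card is still BUSY.
print(run_without_busy_check([10, 11, 12], busy_ticks=5, idle_ticks=1))
```

In this model the slow host gets [10, 11, 12], while the fast host gets [10, 10, 10]: every command completes without an error, but the later ones use the wrong address.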
My BIOS/MONITOR code was totally based on MYIDE, as is probably the case for most of us. Once I changed my code I was able to do bulk transfers, including user area to user area copies, without issues. My system is stable. I will keep it up and running as is for a few days, but I think I have a solution.
Some of what I did to get to this point I will try to undo but only after I get through the next few days. This is especially true for the hardware associated with the hex displays and the LEDs. They may work just fine but I have not tried that yet.
The other change I made was to replace the gates (74LS00, 74LS02, 74ALS32) with the 74ALS versions. I did not change the OC gates or any of the buffers like the 74LS244's. I suspect I can go back to the LS versions but I have not verified that yet either.
I have attached my MONITOR source code that has the IDE code I talked about above. For completeness I have also attached the BIOS but there is no IDE code there. If you have questions let me know.
Rich Leary