Let's consider just one cycle of the Szilard engine. Aside from its discussion of energy-free state polling, one of the main points of Bennett's paper (if you mean Charles Bennett, "The Thermodynamics of Computation: A Review", Int. J. Theor. Phys. 21, No. 12, 1982) is that you must build a finite state machine (a very simple three-state machine) as a minimal Maxwell Daemon. Whichever way you do it, you must implement storage for this state machine in some kind of physical computer memory. Once you come back to the beginning of the cycle, you must either (1) use a new bit in memory for the next cycle, as in your drawings, or (2) initialise the bit to use it again, i.e. "forget" its former state. This "forgetting" is the key to the "mystery" of the decreasing information-theoretic entropy.
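To make the cycle concrete, here is a toy sketch of a demon driven by a three-state memory: one blank "ready" state plus one state per measurement outcome. Be warned that the state names and the structure of the function are my own illustration, not Bennett's actual construction:

```python
# Toy model of one Szilard-engine cycle run by a three-state demon memory.
# Illustrative sketch only; the state names are assumptions for this example.
import random

READY, LEFT, RIGHT = "ready", "left", "right"

def szilard_cycle(state):
    assert state == READY, "the demon must start each cycle with a blank memory"
    # 1. Measure: copy the molecule's side of the box into the demon's memory.
    state = random.choice([LEFT, RIGHT])
    # 2. Extract work: insert the piston on the empty side (which side depends
    #    on the recorded state) and let the one-molecule gas expand isothermally.
    work_extracted = "kT ln 2 (ideal isothermal expansion)"
    # 3. Reset: erase the record so the next cycle can begin. THIS is the step
    #    the discussion above is about: either burn a fresh memory bit, or pay
    #    to "forget" this one.
    state = READY
    return state, work_extracted
```

Note that it is step 3, the reset, where the two options (a fresh bit, or erasure) appear.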
The laws of physics at the microscopic scale are perfectly reversible: there is an invertible (indeed unitary) mapping between the microstate (full quantum state) of any physical system at one time and its state at any other time. If you have a system's full state at any one time, you can derive from it the state at any other time, past or future. The World does not forget its history.
So, when you wipe the bit in the Maxwell Daemon, ready to begin a new cycle, ask yourself how this wiping can be in keeping with my last paragraph, which asserts that the physical process of wiping must be invertible, in principle. This can only mean one thing: wiping the bit must subtly change the state of the "stuff" that makes up the computer. You could in principle run a simulation of the whole process backwards, beginning with a full specification of its state after the deletion, and you would see the state changes in the computer hardware's matter unwind and restore the wiped bit!
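A toy calculation makes this point vivid. "Erase to 0", acting on the bit alone, merges two states into one and so cannot be run backwards; but swapping the bit into a blank environment bit is a permutation of the joint states, hence invertible, and it leaves the erased value recorded in the environment. (This is my own illustration; the one-bit "environment" stands in for the computer hardware's matter.)

```python
# Erasure of a lone bit is not invertible; erasure that records the bit in
# the environment is. Toy illustration with a one-bit environment.

def erase(bit):
    """Wipe the bit in isolation: 0 -> 0 and 1 -> 0."""
    return 0

def swap_into_environment(bit, env):
    """Reversible 'erasure': swap the memory bit with an environment bit."""
    return env, bit

# The bit-only map loses information: two inputs, one output, no inverse.
assert erase(0) == erase(1)

# The swap is a permutation of the four joint states, hence invertible:
states = [(b, e) for b in (0, 1) for e in (0, 1)]
assert sorted(swap_into_environment(b, e) for b, e in states) == sorted(states)

# Starting from a blank environment (env = 0), the memory ends up blank,
# but the environment's state has changed to remember what was erased:
assert swap_into_environment(1, 0) == (0, 1)
assert swap_into_environment(0, 0) == (0, 0)
```

Notice that the reversible version consumes a *fresh blank* environment bit each time, which is exactly where the next paragraphs pick up.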
Therefore, as the Maxwell Daemon runs, the whole bit sequence, recording all the states of all the gas molecules in each of the cycles, must somehow wind up encoded in the changed state of the computer hardware's matter.
Repeated bit erasures change the state of the computer hardware's matter more and more.
This is OK for a while: the Maxwell Daemon seems to win. But every finite physical system has a finite information storage capacity (look up the Bekenstein Bound, for example). In the end, the matter can encode no more cycle bits, and the machine must stop. Alternatively, one can raise a physical system's information storage capacity by making it hotter, so you could give the computer's matter this extra capacity by thermalising it; but that energy has to come from somewhere. Or, yet another alternative: we must do work on the system's matter to drive other physical processes that encode its physical state elsewhere in the Universe, i.e. in the computer's environment. Later on, we shall need to do the same to the room the computer lives in; this particular work is often done by air conditioners! (I jest a little here: most of the energy used by our computers is "inefficient", in the sense that they dissipate roughly ten orders of magnitude more energy than the Landauer limit, i.e. the minimum work needed for the memory erasure and initialisation we have just talked about.)
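For a sense of scale, the two bounds just mentioned are easy to evaluate numerically. The sketch below computes the Landauer limit $k_B T \ln 2$ at room temperature and the Bekenstein bound $2\pi R E / (\hbar c \ln 2)$ bits for 1 kg of matter in a 10 cm sphere; the constants are standard CODATA/SI values, and the 1 kg example system is my own choice, not one from the question:

```python
import math

# Standard physical constants (SI):
k_B  = 1.380649e-23      # Boltzmann constant, J/K (exact since 2019)
hbar = 1.054571817e-34   # reduced Planck constant, J*s
c    = 2.99792458e8      # speed of light, m/s (exact)

# Landauer limit: minimum heat dissipated to erase one bit at temperature T.
T = 300.0  # room temperature, K
landauer_J = k_B * T * math.log(2)
print(f"Landauer limit at {T:.0f} K: {landauer_J:.2e} J per bit erased")

# Bekenstein bound: maximum number of bits a system of radius R and total
# energy E can encode. Example: 1 kg confined to a 0.1 m sphere, E = m c^2.
m, R = 1.0, 0.1
E = m * c**2
bekenstein_bits = 2 * math.pi * R * E / (hbar * c * math.log(2))
print(f"Bekenstein bound for 1 kg in a 0.1 m sphere: {bekenstein_bits:.2e} bits")
```

This gives about 2.9e-21 J per erased bit, and an (enormous, but finite) capacity of about 2.6e42 bits: finite, which is the whole point.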
As powerful as they undoubtedly are for thinking about statistical mechanics, abstract information-theoretic methods can beguile us into forgetting this one simple fact:
In physics, you cannot disembody a physical system's information from the underlying physical system and think about it purely abstractly, as we often do in pure information theory and, in particular, computer science. Nature writes down Her information, Her "bits", in physical "ink", so to speak, and that "ink" is the state of a physical system.
You might like to see my articles "Information is Physical: Landauer's Principle and the Information Soaking Capacity of Physical Systems" and
"Free Energies: What does a physical chemist mean when he/she talks of needing work to throw excess entropy out of a reaction?"