'Bag' is another word for MultiSet, i.e. a set which can contain more than one instance of a given thing.
In Smalltalk, a Bag is implemented as an unordered collection of keyed objects, likely using some kind of hash. It holds any number of objects, in arbitrary order, but optimised for looking them up according to some key value, and allows multiple objects that share a single key. The typical operations on a bag are "insert this object", "remove this object", and "give me a list of all the objects with this key".
A Java Map/HashMap is not a bag. A Map in java maps one key to one value.
RdfLanguage also uses "bag" to refer to an unordered collection.
It's actually difficult to implement a true bag. Apps usually need a way to reference dynamically-allocated things/objects, which requires a unique identifier. In practice this usually means it's really either a map or a list. Such an API will usually do one of 3 things:
--top
Your argument leads to a conclusion that "true bags are not useful for most applications" as opposed to your thesis that "true bags are difficult to implement". I agree. Bags are mostly useless, in practice.
The two issues are related because one usually needs a way to identify something to "retrieve" it. Perhaps I need to reword some of it.
Efficient retrieval of a specific element is not part of the `bag` abstraction. You can always remove elements one at a time until you find the one you're looking for.
The one's that I'm familiar with either have an extractable internal "record ID" of some sort, and/or preserve order for non-keyed tables. They behave more like lists.
Which databases are you familiar with? To what extent do you use sharding and related features?
No, I have not tested their inherent ordering characteristics under sharding.
See also the Dictionary of Algorithms and Data Structures: http://www.nist.gov/dads/HTML/bag.html --ChristofferHammarstrom
See OrderedBag