A word is simply the amount of data a processor can process at one time. As you said, for the x86 processor family we have:
1 word = 2 bytes (16-bit x86 system)
1 word = 4 bytes (32-bit system)
1 word = 8 bytes (64-bit system)
The variation arises because a 16-bit system can process 16 bits = 2 bytes at a time. Similarly, a 32-bit system processes 4 bytes at a time, and a 64-bit system processes 8 bytes.
So the core idea is:
1 word = the number of bytes a processor can process at one time.
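
The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the helper name `word_size_bytes` is made up for this example, and treating the pointer size reported by `struct.calcsize("P")` as the word size of the running interpreter is an assumption that holds on common platforms, not a guarantee.

```python
import struct


def word_size_bytes(bits: int) -> int:
    """Convert a processor width in bits to a word size in bytes (hypothetical helper)."""
    if bits <= 0 or bits % 8 != 0:
        raise ValueError("width must be a positive multiple of 8 bits")
    return bits // 8


# The three cases listed above: 16-bit, 32-bit and 64-bit systems.
for bits in (16, 32, 64):
    print(f"{bits}-bit system: 1 word = {word_size_bytes(bits)} bytes")

# Assumption: the size of a native pointer is used here as a proxy for the
# word size of the platform this interpreter was built for.
print(f"native pointer size here: {struct.calcsize('P')} bytes")
```

On a typical 64-bit build of Python the last line would be expected to report 8 bytes, but that depends on how the interpreter was built.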