I can do the data preprocessing in 3-4 hours, as you mentioned, but I need more information about the domain the data will be used in and the machine learning algorithm you plan to use it with. That will let me choose the right preprocessing method. Let me know what works for you.