December 13, 2022
My wife is a psychopath who has logged her entire wardrobe in an app called “smart closet.” This digitization has been a labor of love carried out over nearly six years. Because she is aphantasic, “smart” closet was a critical tool for planning her outfits on the daily, and it was especially useful for planning what to pack on trips. Unfortunately, the “smart” closet developers have not been good stewards of her data: in a recent update, she temporarily lost everything. Soon after, she got a decent portion of her data back, but she seems to have permanently lost all entries from the last year (i.e. since the last app update).
Now seems to be a fitting time for us to build what I affectionately call “Smarter Closet App™️”: A Notion template that replicates 90% of the good features of “smart” closet (with none of the bad, profit-seeking ones). As a platform, Notion doesn’t seem to be going anywhere anytime soon. If it does, at least it gives users the ability to take their data with them.
While @Camille Merose iterates on the development of this template (which she may or may not choose to share), the first task at hand is to recover as much data as possible. Most of the metadata in the app, she explained to me, is not that hard to re-enter. The main time-sink is having to manually download all the images she painstakingly curated.
To the credit of the “smart” closet developers: they did create an excellent interface for taking pictures of your clothes and removing the background of the images. If Camille could avoid having to do this again for even a decent portion of her closet, that would be a huge amount of time saved.
To that end, here is a quick script to scrape all the images from their app. To use it, go to https://smartcloset.me/closet, right-click anywhere on the page and click “Inspect”, find the tab in the panel that says “Console”, and paste the following code:
// Step 1: Scroll to the bottom of the page.
// Taken from: <https://javascript.plainenglish.io/how-to-get-to-the-end-of-a-page-with-infinite-scrolling-%EF%B8%8F-4c10c3ab4b89>
const MAXIMUM_NUMBER_OF_TRIALS = 3;
const MINIMUM_SLEEPING_TIME_IN_MS = 2000;
const MAXIMUM_SLEEPING_TIME_IN_MS = 3000;
const sleep = (time) => new Promise((resolve) => setTimeout(resolve, time));
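// Random integer in [minimum, maximum); scale by the width of the range, not by maximum.
const randomNumber = (minimum, maximum) => Math.floor(Math.random() * (maximum - minimum)) + minimum;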
const randomSleep = () => sleep(randomNumber(MINIMUM_SLEEPING_TIME_IN_MS, MAXIMUM_SLEEPING_TIME_IN_MS));
let currentScrollHeight = 0;
let manualStop = false;
let numberOfScrolls = 0;
let numberOfTrials = 0;
while (numberOfTrials < MAXIMUM_NUMBER_OF_TRIALS && !manualStop) {
currentScrollHeight = document.body.scrollHeight;
window.scrollTo(0, currentScrollHeight);
await randomSleep();
if (currentScrollHeight === document.body.scrollHeight) {
numberOfTrials++;
console.log(
`Is it already the end of the infinite scroll? ${MAXIMUM_NUMBER_OF_TRIALS - numberOfTrials} trials left.`,
);
} else {
numberOfTrials = 0;
numberOfScrolls++;
console.log(`The scroll #${numberOfScrolls} was successful!`);
}
}
/**
* Download all images to desktop with names based on the URL path.
*
* Adapted from <https://dev.to/sbodi10/download-images-using-javascript-51a9>.
* @param imageSrc
* @returns {Promise<void>}
*/
async function downloadImage(imageSrc) {
await sleep(randomNumber(2_000, 3_000));
const image = await fetch(imageSrc)
const imageBlob = await image.blob()
const imageURL = URL.createObjectURL(imageBlob)
const link = document.createElement('a')
const name = new URL(imageSrc).pathname.replaceAll('/', '-');
link.href = imageURL
link.download = name
document.body.appendChild(link)
link.click()
document.body.removeChild(link)
}
console.log('attempting download...')
// Step 2: Get all the images on the page
let imgs = document.querySelectorAll('img');
// Step 3: Download all images one-by-one, waiting 2-3 seconds
// per image so the app doesn't block any request.
for (const it of Array.from(imgs)) {
await downloadImage(it.src);
}
console.log('done.')
This script takes ~3 seconds per image (the delay keeps the app from blocking the downloads). It took ~15 minutes of running in the background to get all of my wife’s images.
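For a rough sense of how long a run will take before you start one, here is a back-of-the-envelope sketch. The helper name `estimateRunMinutes` is made up for illustration. It assumes each image waits a uniformly random 2-3 seconds and ignores the download time itself, so treat the result as a lower bound.

```javascript
// Hypothetical helper, not part of the script above.
// Assumes a uniformly random 2-3 s pause per image and ignores network time.
function estimateRunMinutes(imageCount, minDelayMs = 2000, maxDelayMs = 3000) {
  const averageDelayMs = (minDelayMs + maxDelayMs) / 2;
  return (imageCount * averageDelayMs) / 60000;
}

// 360 is an illustrative count, not a figure from the app.
console.log(estimateRunMinutes(360)); // 15
```

In the console, `estimateRunMinutes(document.querySelectorAll('img').length)` gives the estimate for whatever page is currently loaded.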
The same script works on the lookbook page, too.
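One optional refinement: the script downloads every `img` on the page, which can include duplicates or inline placeholders. The sketch below filters the list first and also drops the leading dash from the file names. The names `uniqueImageUrls` and `fileNameFor` and the example.com URLs are made up for illustration, and I have not checked whether the app's pages actually contain such extras.

```javascript
// Hypothetical helpers; the names are illustrative, not from the original script.

// Keep only unique http(s) URLs, dropping empty src values and inline data: URIs.
function uniqueImageUrls(srcs) {
  const seen = new Set();
  for (const src of srcs) {
    if (/^https?:\/\//.test(src)) seen.add(src);
  }
  return [...seen];
}

// Same naming rule as the script (slashes become dashes), minus the leading dash.
function fileNameFor(src) {
  return new URL(src).pathname.replace(/^\/+/, '').replaceAll('/', '-');
}

const sample = [
  'https://example.com/img/closet/shirt.png',
  'https://example.com/img/closet/shirt.png', // duplicate
  'data:image/gif;base64,AAAA',               // inline placeholder
  '',                                         // image with no src
];
console.log(uniqueImageUrls(sample).map(fileNameFor)); // [ 'img-closet-shirt.png' ]
```

To use it with the script, you could pass `Array.from(document.querySelectorAll('img'), (img) => img.src)` to `uniqueImageUrls` and loop over the result instead of over `imgs`.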