Project time to completion

Solarion

Hey guys. I got a question for you, especially any of you who have experience in image processing.

Say for example you have an image of a landscape, and you split it up into 1000 equal parts. The entire image has a narrow, black border. The images are not rotated, but are sliced up as is.

The task is to re-assemble this image in code, either with C#, Python or other means.

How long would a project like this take, and would it require a fair amount of image processing knowledge?
 
No. We are not there yet with AI. I've tried and boy have I tried.

Apart from AI I ended up going down the OpenCV rabbit hole. Ended up getting lost in Wonderland a few times.
This honestly sounds like quite a simple prompt for ChatGPT although I can't quite understand what exactly you're trying to achieve.

The following prompt in ChatGPT gave me a decent looking result:

Write a pair of python functions utilizing opencv, one function to split an input image up into 1000 identical segments and another function which can re-assemble these images.


Code:
import os
import re

import cv2
import numpy as np

# 1000 is not a perfect square: int(np.sqrt(1000)) is 31, which gives a
# 31 x 31 grid (961 tiles, plus partial tiles when the size does not divide
# evenly). An explicit grid avoids that: 25 rows x 40 columns = 1000.
ROWS, COLS = 25, 40

def split_image(image_path, output_dir, rows=ROWS, cols=COLS):
    # Load the input image
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    # Get image dimensions
    height, width, _ = image.shape

    # Calculate segment size (leftover pixels on the right/bottom are dropped)
    segment_height = height // rows
    segment_width = width // cols

    os.makedirs(output_dir, exist_ok=True)

    # Initialize counter for file naming
    count = 0

    # Loop through rows and columns to split the image
    for i in range(rows):
        for j in range(cols):
            # Define the region of interest
            y, x = i * segment_height, j * segment_width
            roi = image[y:y+segment_height, x:x+segment_width]

            # Save the segment as an individual image
            segment_path = os.path.join(output_dir, f"segment_{count}.png")
            cv2.imwrite(segment_path, roi)

            count += 1

def reassemble_image(input_dir, output_path, rows=ROWS, cols=COLS):
    # Load all segments in numeric order. os.listdir returns names in
    # arbitrary order, and a plain string sort puts segment_10 before segment_2.
    filenames = [f for f in os.listdir(input_dir) if re.fullmatch(r"segment_\d+\.png", f)]
    filenames.sort(key=lambda f: int(re.search(r"\d+", f).group()))
    segments = [cv2.imread(os.path.join(input_dir, f)) for f in filenames]
    if len(segments) != rows * cols:
        raise ValueError(f"expected {rows * cols} segments, found {len(segments)}")

    # Combine segments into a single image
    segment_height, segment_width, _ = segments[0].shape
    image = np.zeros((rows * segment_height, cols * segment_width, 3), dtype=np.uint8)

    for i in range(rows):
        for j in range(cols):
            # Row-major index: the row stride is the number of columns
            segment_index = i * cols + j
            image[i*segment_height:(i+1)*segment_height, j*segment_width:(j+1)*segment_width] = segments[segment_index]

    # Save the reassembled image
    cv2.imwrite(output_path, image)
 
One catch: this approach relies on the filenames to determine where each segment goes. Now try it with completely randomized filenames, i.e. where a segment's name says nothing about its position in the original image.
 
I think that your problem statement is missing the most important information: Is the order/position of the sub-tiles known or not? If yes, it should be fairly trivial (as @DrJohnZoidberg shows). If not, then the problem becomes a lot more complex.

I am assuming it's the latter, otherwise the problem is fairly easy. In this case, one would have to assemble the image using some sort of edge similarity measure.

The simple algorithm: Pick a random tile to start with. Then find the tile that best matches one of its edges (use something like the sum of squared differences between adjoining edge pixels as the measure). Find the next best fitting tile, etc. There is the possibility of not fitting all tiles of course.
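A minimal sketch of that edge measure and one greedy step, assuming the tiles are already loaded as equally sized NumPy arrays (for example via `cv2.imread`). The names `edge_ssd` and `best_right_neighbour` are made up for illustration, and only the right/left pairing is shown; top/bottom works the same way with rows instead of columns.

```python
import numpy as np

def edge_ssd(left_tile, right_tile):
    """Sum of squared differences between the right-most pixel column of
    left_tile and the left-most pixel column of right_tile."""
    # Cast first: subtracting uint8 arrays directly would wrap around.
    a = left_tile[:, -1].astype(np.float64)
    b = right_tile[:, 0].astype(np.float64)
    return float(np.sum((a - b) ** 2))

def best_right_neighbour(tile_index, tiles, used):
    """One greedy step: index of the unused tile whose left edge best
    matches the right edge of tiles[tile_index]."""
    candidates = [k for k in range(len(tiles))
                  if k not in used and k != tile_index]
    return min(candidates, key=lambda k: edge_ssd(tiles[tile_index], tiles[k]))
```

Repeating this step along a row, then doing the same downwards, gives the simple greedy assembly; nothing here guards against a wrong early pick.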

To improve on the above, one can look for best edge-pairs: To start with, find the best pair of matching edges, and glue those tiles together into larger pieces. Then look for best match between all added edges and all unadded edges. If the new unadded tile touches more than one edge of the added tiles, then apply the fit metric to all touching edges.
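To find the best pair of matching edges you need the cost of every ordered tile pair. A rough sketch, again assuming equally sized NumPy tiles and showing only left-to-right adjacency; `pairwise_edge_costs` and `best_pair` are hypothetical helper names. It uses the identity |a - b|^2 = |a|^2 + |b|^2 - 2 a.b so that 1000 tiles need a 1000 x 1000 matrix rather than a much larger intermediate array.

```python
import numpy as np

def pairwise_edge_costs(tiles):
    """cost[a, b] = squared difference between the right edge of tile a
    and the left edge of tile b, for every ordered pair (a, b)."""
    right = np.stack([t[:, -1].astype(np.float64).ravel() for t in tiles])
    left = np.stack([t[:, 0].astype(np.float64).ravel() for t in tiles])
    cost = ((right ** 2).sum(axis=1)[:, None]
            + (left ** 2).sum(axis=1)[None, :]
            - 2.0 * right @ left.T)
    # A tile cannot be its own neighbour.
    np.fill_diagonal(cost, np.inf)
    return cost

def best_pair(cost):
    """Indices (a, b) of the cheapest pairing: b goes to the right of a."""
    a, b = np.unravel_index(np.argmin(cost), cost.shape)
    return int(a), int(b)
```

After gluing the best pair together you would mask out the edges that are now occupied and repeat, which is the part this sketch leaves out.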

Additional thoughts: You may want to apply an information criterion to the edges as well. So the best candidate would be something like: score for closest match plus score for more information. Here, information would be the diversity of pixel colours on the edge (e.g., sqrt of sum of squared adjacent pixel deltas along the same edge). This avoids picking edges that are, for example, just one colour and matching them to each other as good fits. You could also look more than one pixel in from the edge (so one layer into the border of the tile), and use a similar edge metric, or even a gradient-based edge difference metric (do the pixel deltas follow a gradient? possibly even a non-horizontal or non-vertical one).
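One possible way to combine the two scores, as a sketch only. The `weight` parameter and the way the terms are combined (match cost minus weighted information, lower is better) are my assumptions, not something prescribed above, and the weight would need tuning per image set.

```python
import numpy as np

def edge_information(edge):
    """Diversity of an edge: sqrt of the sum of squared deltas between
    adjacent pixels along that edge. A single-colour edge scores 0."""
    deltas = np.diff(edge.astype(np.float64), axis=0)
    return float(np.sqrt(np.sum(deltas ** 2)))

def combined_score(left_tile, right_tile, weight=1.0):
    """Lower is better: SSD across the seam, reduced when both edges are
    informative. `weight` is a hypothetical tuning knob."""
    a = left_tile[:, -1]
    b = right_tile[:, 0]
    ssd = np.sum((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    info = edge_information(a) + edge_information(b)
    return float(ssd - weight * info)
```

With this scoring two flat black edges score exactly 0, while two textured edges that line up score below 0, so the textured match gets picked first.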

I expect that for most images, you would get quite far with the above, although with a pre-trained AI model you could probably score the entire image as "shuffled" or "non-shuffled", allowing you to differentiate between hard-to-discern tile placements.

Also, one could train a model that tells you whether or not a tile fits into surrounding tiles. This will likely do something to the edge metrics above implicitly, but may do better, since it is more likely to contextualize the contents of the tiles properly. It's trivial to get the data to train such a model, since all you need are images, that you tile up and feed in.

EDIT: I see it is the latter case - good. :) It took me a few mins to write the above.
 
The solution may also be driven by an understanding of the use case. But..
Seeing that the image has a border, finding the tiles that make up the respective borders and corners will be easy, seeing that the tiles are not rotated. Then use @cguy ideas above to best arrange the border tiles.
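Finding the border tiles could look roughly like this. It is a sketch that assumes the border is at least one pixel thick and near-black; the `threshold` value and the function names are made up. A tile from a very dark part of the picture could be misclassified, so treat the result as candidates.

```python
import numpy as np

def dark_sides(tile, threshold=16):
    """Set of sides whose outermost pixel row/column is entirely
    near-black (every value <= threshold)."""
    strips = {
        "top": tile[0],
        "bottom": tile[-1],
        "left": tile[:, 0],
        "right": tile[:, -1],
    }
    return {side for side, strip in strips.items() if strip.max() <= threshold}

def group_border_tiles(tiles):
    """Group tile indices by which sides are dark. For example the key
    frozenset({'top', 'left'}) holds the top-left corner candidate(s)."""
    groups = {}
    for index, tile in enumerate(tiles):
        groups.setdefault(frozenset(dark_sides(tile)), []).append(index)
    return groups
```

Counting the tiles with only a dark top side (plus the two top corners) also tells you the number of columns, which pins down the grid shape.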
OR
Here is where the use case comes into play. If this was a large image where each tile was processed separately, but you have a reduced resolution version of the original, things become easier. You know the size of the tiles so you can slice the low res original into the same tiles. Now you have smaller lower res versions of each tile but you know where they fit. By comparing features of a low res tile with that of the large ones, you can determine best match. The tile features you can compare include any meaningful info you can calc for a tile e.g. entropy, avg RGB for tile subsections, etc. You can also use the above ideas on top of this.
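A sketch of the low-res matching idea using "avg colour for tile subsections" as the feature. It assumes you have already sliced the low-res original into reference tiles in known row-major order; `block_means`, `match_to_reference` and the `k` parameter are hypothetical names. Each tile must be at least `k` pixels in both directions.

```python
import numpy as np

def block_means(tile, k=4):
    """Feature vector: mean value (per channel) of each cell in a k x k
    grid of subsections of the tile."""
    t = tile.astype(np.float64)
    cells = [cell.mean(axis=(0, 1))
             for band in np.array_split(t, k, axis=0)
             for cell in np.array_split(band, k, axis=1)]
    return np.array(cells).ravel()

def match_to_reference(tiles, reference_tiles, k=4):
    """For each full-res tile, the index of the reference (low-res) tile
    with the closest feature vector. Not forced to be one-to-one."""
    ref = np.stack([block_means(r, k) for r in reference_tiles])
    matches = []
    for tile in tiles:
        feature = block_means(tile, k)
        matches.append(int(np.argmin(np.sum((ref - feature) ** 2, axis=1))))
    return matches
```

Because two tiles can land on the same position, a real version would probably resolve conflicts with an assignment step (for instance `scipy.optimize.linear_sum_assignment` on the full distance matrix).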
 
Thank you for the amazing feedback guys. It sounds like quite an involved project, this. I gather this is not something one can just put together in a couple of hours? A few days, possibly, is more like it.
 
If the images are indeed landscapes, or images with similar properties then the simplest solution I outlined above would likely work pretty well.

If it’s general images, it’s much harder. Imagine a face on a black background. So many tile edges are just black, so how does one match that up using a greedy algorithm? What if there is a sharp boundary in the image that aligns exactly with a vertical or horizontal tile edge (very unlikely in a landscape, but much more likely in a cityscape)? So you need something more complex.
 
Need more information, namely do you know the output shape? Because 1000 tiles can be arranged in quite a few ways.
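To put a number on "quite a few ways": the possible grid shapes are just the factor pairs of the tile count. A tiny sketch (the function name is made up):

```python
def grid_shapes(n):
    """All (rows, cols) pairs with rows * cols == n."""
    return [(rows, n // rows) for rows in range(1, n + 1) if n % rows == 0]

shapes = grid_shapes(1000)
print(len(shapes))  # 16 shapes, from (1, 1000) through (25, 40) to (1000, 1)
```

If the tile size in pixels is known, most of these can be ruled out for a landscape image because they would give an absurd aspect ratio.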

A better way to conceptualise the problem is: how much time are you or your client willing to spend on this? Is it a once off approach, is it quick and dirty, or is it something that needs to be production ready and perfect?

Once off:
I would do the Amazon approach and pay $5 for a wagie in India to do it.
Quick and dirty:
Arrange the tiles in a random order. Define a metric to determine similarity based on edges. (Quick and dirty, I would just use sum of absolute differences on pixel values on each edge). Compute this value for the current arrangement. That is your cost function. Then use simulated annealing to move tiles around to minimise the cost function. You should end up with something that looks like the original.
Production ready:
As others have said, you would need a machine learning approach that can evaluate whether the tiles are matching based on context.
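The quick and dirty option could be sketched like this. Assumptions on my side: the grid shape is known, tiles are equally sized NumPy arrays, a move is a swap of two random positions, and the temperature schedule (geometric, starting at the mean pair cost) is an arbitrary choice. The function names are made up. It recomputes the full cost after every swap, which is slow but keeps the sketch short.

```python
import math
import random
import numpy as np

def pair_costs(tiles):
    """H[a, b]: SAD when b sits right of a. V[a, b]: SAD when b sits below a."""
    t = [x.astype(np.int64) for x in tiles]  # avoid uint8 wrap-around
    n = len(t)
    H, V = np.zeros((n, n)), np.zeros((n, n))
    for a in range(n):
        for b in range(n):
            H[a, b] = np.abs(t[a][:, -1] - t[b][:, 0]).sum()
            V[a, b] = np.abs(t[a][-1] - t[b][0]).sum()
    return H, V

def total_cost(grid, H, V):
    """Cost function: sum of edge differences over all adjacent grid cells."""
    return float(H[grid[:, :-1], grid[:, 1:]].sum()
                 + V[grid[:-1, :], grid[1:, :]].sum())

def anneal(tiles, rows, cols, steps=20000, seed=0):
    assert len(tiles) == rows * cols
    rng = random.Random(seed)
    H, V = pair_costs(tiles)
    order = list(range(rows * cols))
    rng.shuffle(order)                     # random starting arrangement
    flat = np.array(order)
    grid = flat.reshape(rows, cols)        # view: swaps in flat show up here
    cost = total_cost(grid, H, V)
    best, best_cost = grid.copy(), cost
    t_start = max(float(H.mean()), 1e-9)   # assumed schedule, needs tuning
    t_end = t_start * 1e-3
    for step in range(steps):
        temp = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.randrange(len(flat)), rng.randrange(len(flat))
        flat[i], flat[j] = flat[j], flat[i]
        new_cost = total_cost(grid, H, V)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = grid.copy(), cost
        else:
            flat[i], flat[j] = flat[j], flat[i]  # undo the swap
    return best, best_cost
```

`best[r, c]` is the index of the tile placed at row r, column c. There is no guarantee it reaches the original arrangement; for 1000 tiles it would likely need far more steps and an incremental cost update.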
 
Well, you can just swap 1 black tile with another, that's not a problem. If however there was a vertical line then if the image is square it could easily become a horizontal line.

I don't think you'll be able to do this on complex aligned images without using some image recognition to for example find an up position for some tiles and then to work from those.
 
I'm gonna give it a crack and see what I can do. May take a week or two or longer but I'm keen to get this working. I'll post what I have as I go along.
 
The "simple" algorithm I outlined matches up edges of tiles, so the issue is that in the example, even non-black tiles could have black edges. This means that one would likely need to do some sort of combinatorial (tree) search to get passable results, rather than use a greedy algorithm, which is a bit more complex.
 