不使用任何内置函数从数组捕获重复项

我被赋予了一项任务，编写自己的实现，从数组中删除重复对象。数组未排序

作为一个例子，我有一个对象数组

ItemsList[] objects = {
new ItemsList("ob1"),
new ItemsList("ob2"),
new ItemsList("ob2"),
new ItemsList("ob1"),
new ItemsList("ob3")
};
"ob1" stands for itemId

我的目标是获得像这样的结果数组["ob1", "ob2", "ob3"]，但给定NullPointerException时，试图找到对象没有加倍，并将其添加到数组。

注意:不能使用Set, HashSet, ArrayList, Arrays。copyOf, Sort等或任何其他工具，如迭代器。

到目前为止，我已经完成了:

public String[] removeDuplicates(ItemsList[] objects) {
String[] noDubs = new String[objects.length];
int len = objects.length;
int pos = 0;
for (int i = 0; i < len; i++) {
for (int j = i + 1; j < len; j++) {
if (objects[i].getItemId().equals(objects[j].getItemId())) {
noDubs[pos] = objects[i].getItemId();
pos++;
}
else {
//NullPointerException given
if (!objects[i].getItemId().equals(objects[j].getItemId()) && !objects[i].getItemId().contains(noDubs[i])) {
noDubs[pos] = objects[i].getItemId();
pos++;
}
}
}
}
String[] result = new String[pos];
for(int k = 0; k < pos; k++) {
result[k] = noDubs[k];
}
return result;
}

getItemId是类ItemsList方法

这是另一个选项。这将在第一次找到条目时将它们复制到缓冲区中，然后将缓冲区复制到正确大小的数组中。

在缓冲区中查找重复项，而不是在原始数组中查找是有好处的。如果有很多重复项，那么在缓冲区中查找的检查次数将比在原始数组中查找的检查次数少。

我取出循环来检查一个项目是否在另一个函数的缓冲区中。这样可以避免嵌套for循环，从而使其更易于阅读。

我认为这种整体方法还减少了跟踪所需的变量数量，这也有助于使其更易于阅读。

private static ItemsList[] removeDuplicates(ItemsList[] arr) {
ItemsList[] buffer = new ItemsList[arr.length];
int bufferLength = 0;

for (ItemsList candidate : arr) {
if (!isInBuffer(candidate, buffer, bufferLength)) {
buffer[bufferLength] = candidate;
bufferLength++;
}
}   

ItemsList[] result = new ItemsList[bufferLength];
for (int i = 0; i < bufferLength; i++) {
result[i] = buffer[i];
}
return result;
}
private static boolean isInBuffer(ItemsList candidate, ItemsList[] buffer, int bufferLength) {
for(int i = 0; i < bufferLength; i++) {
if (Objects.equals(candidate, buffer[i])) {
return true;
}
}
return false;
}

下面是一个实现。如果该项等于下一个元素之一，则将其标记为重复项。稍后将其添加到输出数组中，该数组将需要增长。

public String[] removeDuplicates(ItemList[] items) {
String[] output = {};
for (int i = 0; i < items.length; i++) {
boolean isDuplicated = false;
ItemList current = items[i];            
if (current == null)
throw new RuntimeException("item can not be null");
for (int j = i + 1; j < items.length; j++) {
if (current.equals(items[j])) {
isDuplicated = true;
break;
}
}
if (!isDuplicated) {
String[] temp = new String[output.length + 1];
System.arraycopy(output, 0, temp, 0, output.length);
temp[output.length] = current.getItemId();
output = temp;
}
}
return output;
}

您还可以将每个重复的元素设置为null，然后将每个非null元素添加到输出数组中，如下所示:

public String[] removeDuplicates2(ItemList[] items) {
String[] temp = new String[items.length];
int inTemp = 0;
for (ItemList item : items) {
boolean isDuplicated = false;
if (item == null)
throw new RuntimeException("item can not be null");
for (int i = 0; i < inTemp; i++) {
if (item.getItemId().equals(temp[i])) {
isDuplicated = true;
break;
}
}
if (!isDuplicated) temp[inTemp++] = item.getItemId();
}
String[] output = new String[inTemp];
System.arraycopy(temp, 0, output, 0, inTemp);
return output;
}

但请注意，第一个解决方案更快

最好使用布尔值的中间数组来跟踪重复项并定义结果数组的长度。当同一元素可能出现超过2次时，这将有助于检测多个重复。

此外，我们需要确保equals(可能还有hashCode)方法在ItemList中被正确覆盖:

// class ItemList
public boolean equals(Object o) {
if (null == o || !(o instanceof ItemList)) {
return false;
}
if (o == this) return true;
ItemList that = (ItemList) o;
return Objects.equals(this.itemId, that.itemId);
}
public int hashCode() {
return Objects.hash(this.itemId);
}

当检查ItemList是否相等时，最好使用Objects.equals来处理null的值，而不要抛出NullPointerException。因此，输入items中的重复null条目也将被过滤掉。

public static String[] removeDuplicates(ItemList[] items) {
final int n = items.length;
if (n < 1) {
return new String[0];
}
boolean[] dups = new boolean[n];
int dupCount = 0;
for (int i = 0; i < n; i++) {
ItemList current = items[i];
for (int j = i + 1; j < n; j++) {
if (dups[j]) {
continue;
}
if (Objects.equals(current, items[j])) {
dups[j] = true;
dupCount++;
}
}
}
String[] output = new String[n - dupCount];
for (int i = 0, j = 0; i < n; i++) {
if (!dups[i]) {
output[j++] = null == items[i] ? "<NULL>" : items[i].getItemId();
}
}
// info message
System.out.printf("Found and removed %d duplicate value%s%n", dupCount, dupCount != 1 ? "s" : "");
return output;
}

测试:

ItemList[] items = {
null, new ItemList("ob1"), new ItemList("ob2"), new ItemList("ob2"), new ItemList("ob1"),
new ItemList("ob3"), null, new ItemList("ob3"), new ItemList(null), new ItemList("ob5"),
new ItemList("ob2"), new ItemList(null), new ItemList("ob4"), new ItemList("ob5"), null,
};
System.out.println(Arrays.toString(removeDuplicates(items)));
// compare removal of duplicates to using set
System.out.println("nUsing Set");
Set<ItemList> set = new LinkedHashSet<>(Arrays.asList(items));  
System.out.println(set);

输出:

Found and removed 8 duplicate values
[<NULL>, ob1, ob2, ob3, null, ob5, ob4]
Using Set
[null, {id=ob1}, {id=ob2}, {id=ob3}, {id=null}, {id=ob5}, {id=ob4}]

相关内容

最新更新

热门标签：